A significant number of hotel bookings are called off due to cancellations or no-shows. Typical reasons for cancellation include changes of plans and scheduling conflicts. Canceling is often made easier by the option to do so free of charge or at a low cost. This benefits hotel guests, but it is less desirable for hotels and can reduce their revenue. Such losses are particularly high for last-minute cancellations.
The new technologies involving online booking channels have dramatically changed customers’ booking possibilities and behavior. This adds a further dimension to the challenge of how hotels handle cancellations, which are no longer limited to traditional booking and guest characteristics.
The cancellation of bookings impacts a hotel on various fronts:
The increasing number of cancellations calls for a machine-learning-based solution that can help predict which bookings are likely to be canceled. INN Hotels Group operates a chain of hotels in Portugal. It is facing a high number of booking cancellations and has reached out to your firm for data-driven solutions. As a data scientist, you have to analyze the data provided to find which factors have a strong influence on booking cancellations, build a model that can predict in advance which bookings will be canceled, and help formulate profitable policies for cancellations and refunds.
The data contains the different attributes of customers' booking details. The detailed data dictionary is given below.
Data Dictionary
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings("ignore")
%matplotlib inline
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn import tree
# To build linear model for statistical analysis and prediction
import statsmodels.stats.api as sms
from statsmodels.stats.outliers_influence import variance_inflation_factor
import statsmodels.api as sm
from statsmodels.tools.tools import add_constant
# To tune different models
from sklearn.model_selection import GridSearchCV
# To get different metric scores
from sklearn.metrics import (
    f1_score,
    accuracy_score,
    recall_score,
    precision_score,
    confusion_matrix,
    roc_auc_score,
    plot_confusion_matrix,
    precision_recall_curve,
    roc_curve,
    make_scorer,
)
# this will help in making the Python code more structured automatically (good coding practice)
%load_ext nb_black
# loading the dataset
data = pd.read_csv("INNHotelsGroup.csv")
# checking the shape of the data
data.shape
print(f"The data contains {data.shape[0]} rows") # f-string
print(f"The data contains {data.shape[1]} columns") # f-string
The data contains 36275 rows
The data contains 19 columns
# let's view a sample of the data
data.sample(15, random_state=23)
| | Booking_ID | no_of_adults | no_of_children | no_of_weekend_nights | no_of_week_nights | type_of_meal_plan | required_car_parking_space | room_type_reserved | lead_time | arrival_year | arrival_month | arrival_date | market_segment_type | repeated_guest | no_of_previous_cancellations | no_of_previous_bookings_not_canceled | avg_price_per_room | no_of_special_requests | booking_status |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 17949 | INN17950 | 1 | 0 | 2 | 1 | Meal Plan 2 | 0 | Room_Type 1 | 122 | 2018 | 3 | 27 | Offline | 0 | 0 | 0 | 86.00 | 1 | Not_Canceled |
| 11766 | INN11767 | 2 | 0 | 1 | 1 | Meal Plan 1 | 0 | Room_Type 1 | 60 | 2017 | 10 | 12 | Offline | 0 | 0 | 0 | 65.00 | 1 | Not_Canceled |
| 32026 | INN32027 | 2 | 0 | 1 | 0 | Meal Plan 1 | 0 | Room_Type 1 | 11 | 2017 | 8 | 24 | Online | 0 | 0 | 0 | 90.00 | 1 | Not_Canceled |
| 26671 | INN26672 | 2 | 0 | 0 | 4 | Meal Plan 1 | 0 | Room_Type 1 | 270 | 2018 | 4 | 20 | Offline | 0 | 0 | 0 | 62.80 | 0 | Canceled |
| 21857 | INN21858 | 2 | 0 | 2 | 2 | Meal Plan 1 | 0 | Room_Type 1 | 219 | 2018 | 10 | 22 | Online | 0 | 0 | 0 | 90.95 | 2 | Not_Canceled |
| 30740 | INN30741 | 2 | 0 | 2 | 3 | Meal Plan 1 | 0 | Room_Type 4 | 26 | 2017 | 10 | 4 | Online | 0 | 0 | 0 | 79.85 | 1 | Canceled |
| 23706 | INN23707 | 2 | 0 | 2 | 3 | Not Selected | 0 | Room_Type 1 | 61 | 2018 | 7 | 23 | Online | 0 | 0 | 0 | 94.50 | 1 | Not_Canceled |
| 13197 | INN13198 | 2 | 0 | 0 | 2 | Meal Plan 1 | 0 | Room_Type 1 | 245 | 2018 | 6 | 17 | Offline | 0 | 0 | 0 | 75.00 | 0 | Canceled |
| 14078 | INN14079 | 3 | 0 | 0 | 3 | Meal Plan 1 | 0 | Room_Type 4 | 120 | 2018 | 6 | 2 | Online | 0 | 0 | 0 | 159.30 | 1 | Canceled |
| 7827 | INN07828 | 2 | 0 | 2 | 1 | Meal Plan 1 | 0 | Room_Type 1 | 88 | 2018 | 12 | 18 | Offline | 0 | 0 | 0 | 75.00 | 0 | Not_Canceled |
| 34632 | INN34633 | 2 | 0 | 1 | 3 | Not Selected | 0 | Room_Type 1 | 26 | 2018 | 10 | 3 | Online | 0 | 0 | 0 | 103.76 | 1 | Not_Canceled |
| 11748 | INN11749 | 3 | 0 | 1 | 4 | Meal Plan 2 | 0 | Room_Type 1 | 197 | 2018 | 12 | 21 | Offline | 0 | 0 | 0 | 160.60 | 0 | Not_Canceled |
| 34636 | INN34637 | 1 | 0 | 0 | 2 | Meal Plan 1 | 0 | Room_Type 1 | 4 | 2018 | 6 | 24 | Offline | 0 | 0 | 0 | 95.00 | 0 | Not_Canceled |
| 14320 | INN14321 | 2 | 1 | 0 | 3 | Meal Plan 1 | 0 | Room_Type 1 | 51 | 2018 | 12 | 6 | Online | 0 | 0 | 0 | 109.80 | 2 | Not_Canceled |
| 3183 | INN03184 | 2 | 0 | 1 | 2 | Meal Plan 1 | 0 | Room_Type 1 | 34 | 2017 | 10 | 12 | Online | 0 | 0 | 0 | 103.50 | 1 | Not_Canceled |
# creating a copy of the data so that original data remains unchanged
df = data.copy()
# checking for duplicate values in the data
df.duplicated().sum()
0
# checking column datatypes and number of non-null values
df.info()
<class 'pandas.core.frame.DataFrame'> RangeIndex: 36275 entries, 0 to 36274 Data columns (total 19 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 Booking_ID 36275 non-null object 1 no_of_adults 36275 non-null int64 2 no_of_children 36275 non-null int64 3 no_of_weekend_nights 36275 non-null int64 4 no_of_week_nights 36275 non-null int64 5 type_of_meal_plan 36275 non-null object 6 required_car_parking_space 36275 non-null int64 7 room_type_reserved 36275 non-null object 8 lead_time 36275 non-null int64 9 arrival_year 36275 non-null int64 10 arrival_month 36275 non-null int64 11 arrival_date 36275 non-null int64 12 market_segment_type 36275 non-null object 13 repeated_guest 36275 non-null int64 14 no_of_previous_cancellations 36275 non-null int64 15 no_of_previous_bookings_not_canceled 36275 non-null int64 16 avg_price_per_room 36275 non-null float64 17 no_of_special_requests 36275 non-null int64 18 booking_status 36275 non-null object dtypes: float64(1), int64(13), object(5) memory usage: 5.3+ MB
# checking for missing values in the data.
df.isnull().sum()
Booking_ID                              0
no_of_adults                            0
no_of_children                          0
no_of_weekend_nights                    0
no_of_week_nights                       0
type_of_meal_plan                       0
required_car_parking_space              0
room_type_reserved                      0
lead_time                               0
arrival_year                            0
arrival_month                           0
arrival_date                            0
market_segment_type                     0
repeated_guest                          0
no_of_previous_cancellations            0
no_of_previous_bookings_not_canceled    0
avg_price_per_room                      0
no_of_special_requests                  0
booking_status                          0
dtype: int64
# Statistical summary of the data
df.describe(include="all").T
| | count | unique | top | freq | mean | std | min | 25% | 50% | 75% | max |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Booking_ID | 36275 | 36275 | INN04089 | 1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| no_of_adults | 36275.0 | NaN | NaN | NaN | 1.844962 | 0.518715 | 0.0 | 2.0 | 2.0 | 2.0 | 4.0 |
| no_of_children | 36275.0 | NaN | NaN | NaN | 0.105279 | 0.402648 | 0.0 | 0.0 | 0.0 | 0.0 | 10.0 |
| no_of_weekend_nights | 36275.0 | NaN | NaN | NaN | 0.810724 | 0.870644 | 0.0 | 0.0 | 1.0 | 2.0 | 7.0 |
| no_of_week_nights | 36275.0 | NaN | NaN | NaN | 2.2043 | 1.410905 | 0.0 | 1.0 | 2.0 | 3.0 | 17.0 |
| type_of_meal_plan | 36275 | 4 | Meal Plan 1 | 27835 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| required_car_parking_space | 36275.0 | NaN | NaN | NaN | 0.030986 | 0.173281 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| room_type_reserved | 36275 | 7 | Room_Type 1 | 28130 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| lead_time | 36275.0 | NaN | NaN | NaN | 85.232557 | 85.930817 | 0.0 | 17.0 | 57.0 | 126.0 | 443.0 |
| arrival_year | 36275.0 | NaN | NaN | NaN | 2017.820427 | 0.383836 | 2017.0 | 2018.0 | 2018.0 | 2018.0 | 2018.0 |
| arrival_month | 36275.0 | NaN | NaN | NaN | 7.423653 | 3.069894 | 1.0 | 5.0 | 8.0 | 10.0 | 12.0 |
| arrival_date | 36275.0 | NaN | NaN | NaN | 15.596995 | 8.740447 | 1.0 | 8.0 | 16.0 | 23.0 | 31.0 |
| market_segment_type | 36275 | 5 | Online | 23214 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| repeated_guest | 36275.0 | NaN | NaN | NaN | 0.025637 | 0.158053 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| no_of_previous_cancellations | 36275.0 | NaN | NaN | NaN | 0.023349 | 0.368331 | 0.0 | 0.0 | 0.0 | 0.0 | 13.0 |
| no_of_previous_bookings_not_canceled | 36275.0 | NaN | NaN | NaN | 0.153411 | 1.754171 | 0.0 | 0.0 | 0.0 | 0.0 | 58.0 |
| avg_price_per_room | 36275.0 | NaN | NaN | NaN | 103.423539 | 35.089424 | 0.0 | 80.3 | 99.45 | 120.0 | 540.0 |
| no_of_special_requests | 36275.0 | NaN | NaN | NaN | 0.619655 | 0.786236 | 0.0 | 0.0 | 0.0 | 1.0 | 5.0 |
| booking_status | 36275 | 2 | Not_Canceled | 24390 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
df["booking_status"].value_counts()
Not_Canceled    24390
Canceled        11885
Name: booking_status, dtype: int64
for feature in df.columns:  # Loop through all columns in the dataframe
    if df[feature].dtype == "object":  # Only apply to columns holding strings
        df[feature] = pd.Categorical(df[feature])  # Replace object with categorical
df.info()
<class 'pandas.core.frame.DataFrame'> RangeIndex: 36275 entries, 0 to 36274 Data columns (total 19 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 Booking_ID 36275 non-null category 1 no_of_adults 36275 non-null int64 2 no_of_children 36275 non-null int64 3 no_of_weekend_nights 36275 non-null int64 4 no_of_week_nights 36275 non-null int64 5 type_of_meal_plan 36275 non-null category 6 required_car_parking_space 36275 non-null int64 7 room_type_reserved 36275 non-null category 8 lead_time 36275 non-null int64 9 arrival_year 36275 non-null int64 10 arrival_month 36275 non-null int64 11 arrival_date 36275 non-null int64 12 market_segment_type 36275 non-null category 13 repeated_guest 36275 non-null int64 14 no_of_previous_cancellations 36275 non-null int64 15 no_of_previous_bookings_not_canceled 36275 non-null int64 16 avg_price_per_room 36275 non-null float64 17 no_of_special_requests 36275 non-null int64 18 booking_status 36275 non-null category dtypes: category(5), float64(1), int64(13) memory usage: 5.4 MB
for feature in df.columns:  # Loop through all columns in the dataframe
    if df[feature].dtype != "int64":  # Skip integer columns; this keeps the categorical columns and the float price column
        print(df[feature].value_counts())
INN00001 1
INN24187 1
INN24181 1
INN24182 1
INN24183 1
..
INN12086 1
INN12085 1
INN12084 1
INN12083 1
INN36275 1
Name: Booking_ID, Length: 36275, dtype: int64
Meal Plan 1 27835
Not Selected 5130
Meal Plan 2 3305
Meal Plan 3 5
Name: type_of_meal_plan, dtype: int64
Room_Type 1 28130
Room_Type 4 6057
Room_Type 6 966
Room_Type 2 692
Room_Type 5 265
Room_Type 7 158
Room_Type 3 7
Name: room_type_reserved, dtype: int64
Online 23214
Offline 10528
Corporate 2017
Complementary 391
Aviation 125
Name: market_segment_type, dtype: int64
65.00 848
75.00 826
90.00 703
95.00 669
115.00 662
...
139.88 1
82.59 1
126.69 1
108.35 1
178.33 1
Name: avg_price_per_room, Length: 3930, dtype: int64
Not_Canceled 24390
Canceled 11885
Name: booking_status, dtype: int64
Leading Questions:
# function to plot a boxplot and a histogram along the same scale.
def histogram_boxplot(data, feature, figsize=(12, 7), kde=False, bins=None):
    """
    Boxplot and histogram combined

    data: dataframe
    feature: dataframe column
    figsize: size of figure (default (12,7))
    kde: whether to show the density curve (default False)
    bins: number of bins for histogram (default None)
    """
    f2, (ax_box2, ax_hist2) = plt.subplots(
        nrows=2,  # Number of rows of the subplot grid = 2
        sharex=True,  # x-axis will be shared among all subplots
        gridspec_kw={"height_ratios": (0.25, 0.75)},
        figsize=figsize,
    )  # creating the 2 subplots
    sns.boxplot(
        data=data, x=feature, ax=ax_box2, showmeans=True, color="violet"
    )  # boxplot will be created and a marker will indicate the mean value of the column
    # For histogram: pass bins only when a value was given
    if bins:
        sns.histplot(data=data, x=feature, kde=kde, ax=ax_hist2, bins=bins)
    else:
        sns.histplot(data=data, x=feature, kde=kde, ax=ax_hist2)
    ax_hist2.axvline(
        data[feature].mean(), color="green", linestyle="--"
    )  # Add mean to the histogram
    ax_hist2.axvline(
        data[feature].median(), color="black", linestyle="-"
    )  # Add median to the histogram
# function to create labeled barplots
def labeled_barplot(data, feature, perc=False, n=None):
    """
    Barplot with count or percentage at the top of each bar

    data: dataframe
    feature: dataframe column
    perc: whether to display percentages instead of count (default is False)
    n: displays the top n category levels (default is None, i.e., display all levels)
    """
    total = len(data[feature])  # length of the column
    count = data[feature].nunique()
    if n is None:
        plt.figure(figsize=(count + 1, 5))
    else:
        plt.figure(figsize=(n + 1, 5))

    plt.xticks(rotation=90, fontsize=15)
    ax = sns.countplot(
        data=data,
        x=feature,
        palette="Paired",
        order=data[feature].value_counts().index[:n].sort_values(),
    )

    for p in ax.patches:
        if perc:
            label = "{:.1f}%".format(
                100 * p.get_height() / total
            )  # percentage of each class of the category
        else:
            label = p.get_height()  # count of each level of the category

        x = p.get_x() + p.get_width() / 2  # horizontal center of the bar
        y = p.get_height()  # height of the bar

        ax.annotate(
            label,
            (x, y),
            ha="center",
            va="center",
            size=12,
            xytext=(0, 5),
            textcoords="offset points",
        )  # annotate the bar with its count or percentage

    plt.show()  # show the plot
histogram_boxplot(df, "no_of_adults")
labeled_barplot(df, "no_of_adults")
df.loc[df["no_of_adults"] == 0]
| | Booking_ID | no_of_adults | no_of_children | no_of_weekend_nights | no_of_week_nights | type_of_meal_plan | required_car_parking_space | room_type_reserved | lead_time | arrival_year | arrival_month | arrival_date | market_segment_type | repeated_guest | no_of_previous_cancellations | no_of_previous_bookings_not_canceled | avg_price_per_room | no_of_special_requests | booking_status |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 32 | INN00033 | 0 | 2 | 0 | 3 | Meal Plan 1 | 0 | Room_Type 2 | 56 | 2018 | 12 | 7 | Online | 0 | 0 | 0 | 82.44 | 1 | Not_Canceled |
| 287 | INN00288 | 0 | 2 | 2 | 2 | Meal Plan 1 | 0 | Room_Type 1 | 68 | 2018 | 4 | 24 | Online | 0 | 0 | 0 | 108.38 | 1 | Canceled |
| 653 | INN00654 | 0 | 2 | 1 | 2 | Meal Plan 1 | 0 | Room_Type 2 | 78 | 2018 | 8 | 19 | Online | 0 | 0 | 0 | 115.68 | 1 | Not_Canceled |
| 937 | INN00938 | 0 | 2 | 0 | 3 | Meal Plan 1 | 0 | Room_Type 2 | 40 | 2018 | 1 | 14 | Online | 0 | 0 | 0 | 6.67 | 1 | Not_Canceled |
| 954 | INN00955 | 0 | 2 | 1 | 1 | Meal Plan 1 | 0 | Room_Type 2 | 92 | 2018 | 10 | 29 | Online | 0 | 0 | 0 | 81.50 | 2 | Not_Canceled |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 34720 | INN34721 | 0 | 2 | 0 | 3 | Meal Plan 1 | 0 | Room_Type 2 | 76 | 2018 | 9 | 21 | Online | 0 | 0 | 0 | 127.38 | 3 | Not_Canceled |
| 34730 | INN34731 | 0 | 2 | 1 | 1 | Meal Plan 1 | 0 | Room_Type 2 | 178 | 2018 | 8 | 27 | Online | 0 | 0 | 0 | 88.77 | 0 | Canceled |
| 34890 | INN34891 | 0 | 2 | 2 | 2 | Meal Plan 1 | 0 | Room_Type 2 | 31 | 2018 | 9 | 16 | Online | 0 | 0 | 0 | 124.25 | 2 | Not_Canceled |
| 35691 | INN35692 | 0 | 2 | 2 | 1 | Meal Plan 1 | 0 | Room_Type 2 | 75 | 2018 | 3 | 19 | Online | 0 | 0 | 0 | 78.00 | 0 | Canceled |
| 35797 | INN35798 | 0 | 2 | 0 | 2 | Meal Plan 1 | 0 | Room_Type 2 | 120 | 2018 | 6 | 30 | Online | 0 | 0 | 0 | 97.55 | 0 | Not_Canceled |
139 rows × 19 columns
histogram_boxplot(df, "no_of_children")
labeled_barplot(df, "no_of_children")
histogram_boxplot(df, "no_of_weekend_nights")
labeled_barplot(df, "no_of_weekend_nights")
histogram_boxplot(df, "no_of_week_nights")
labeled_barplot(df, "no_of_week_nights")
labeled_barplot(df, "type_of_meal_plan")
histogram_boxplot(df, "required_car_parking_space")
labeled_barplot(df, "required_car_parking_space")
labeled_barplot(df, "room_type_reserved")
histogram_boxplot(df, "lead_time")
histogram_boxplot(df, "arrival_year")
histogram_boxplot(df, "arrival_month")
histogram_boxplot(df, "arrival_date")
labeled_barplot(df, "market_segment_type")
histogram_boxplot(df, "repeated_guest")
labeled_barplot(df, "repeated_guest")
histogram_boxplot(df, "no_of_previous_cancellations")
labeled_barplot(df, "no_of_previous_cancellations")
histogram_boxplot(df, "no_of_previous_bookings_not_canceled")
df["no_of_previous_bookings_not_canceled"].value_counts()
0     35463
1       228
2       112
3        80
4        65
5        60
6        36
7        24
8        23
10       19
9        19
11       15
12       12
14        9
15        8
13        7
16        7
20        6
17        6
18        6
19        6
22        6
21        6
27        3
23        3
24        3
25        3
28        2
30        2
29        2
26        2
48        2
31        2
44        2
32        2
47        1
55        1
33        1
34        1
35        1
36        1
58        1
37        1
57        1
38        1
56        1
39        1
40        1
54        1
41        1
53        1
42        1
52        1
43        1
51        1
50        1
45        1
49        1
46        1
Name: no_of_previous_bookings_not_canceled, dtype: int64
histogram_boxplot(df, "avg_price_per_room")
min_price = df["avg_price_per_room"].min()
max_price = df["avg_price_per_room"].max()
print(f" Minimum price for rent a room per night is {min_price}")
print(f" Maximum price for rent a room per night is {max_price}")
Minimum price for rent a room per night is 0.0
Maximum price for rent a room per night is 540.0
histogram_boxplot(df, "no_of_special_requests")
df["no_of_special_requests"].value_counts()
0    19777
1    11373
2     4364
3      675
4       78
5        8
Name: no_of_special_requests, dtype: int64
labeled_barplot(df, "booking_status")
sns.set_style("darkgrid")
df.hist(figsize=(15, 10), bins=12)
plt.show()
### function to plot distributions wrt target
def distribution_plot_wrt_target(data, predictor, target):

    fig, axs = plt.subplots(2, 2, figsize=(12, 10))

    target_uniq = data[target].unique()

    axs[0, 0].set_title("Distribution of target for target=" + str(target_uniq[0]))
    sns.histplot(
        data=data[data[target] == target_uniq[0]],
        x=predictor,
        kde=True,
        ax=axs[0, 0],
        color="teal",
        stat="density",
    )

    axs[0, 1].set_title("Distribution of target for target=" + str(target_uniq[1]))
    sns.histplot(
        data=data[data[target] == target_uniq[1]],
        x=predictor,
        kde=True,
        ax=axs[0, 1],
        color="orange",
        stat="density",
    )

    axs[1, 0].set_title("Boxplot w.r.t target")
    sns.boxplot(data=data, x=target, y=predictor, ax=axs[1, 0], palette="gist_rainbow")

    axs[1, 1].set_title("Boxplot (without outliers) w.r.t target")
    sns.boxplot(
        data=data,
        x=target,
        y=predictor,
        ax=axs[1, 1],
        showfliers=False,
        palette="gist_rainbow",
    )

    plt.tight_layout()
    plt.show()
def stacked_barplot(data, predictor, target):
    """
    Print the category counts and plot a stacked bar chart

    data: dataframe
    predictor: independent variable
    target: target variable
    """
    count = data[predictor].nunique()
    # sort rows by the least frequent target class
    sorter = data[target].value_counts().index[-1]
    tab1 = pd.crosstab(data[predictor], data[target], margins=True).sort_values(
        by=sorter, ascending=False
    )
    print(tab1)
    print("-" * 120)
    tab = pd.crosstab(data[predictor], data[target], normalize="index").sort_values(
        by=sorter, ascending=False
    )
    tab.plot(kind="bar", stacked=True, figsize=(count + 5, 5))
    # place the legend outside the plot area
    plt.legend(loc="upper left", bbox_to_anchor=(1, 1))
    plt.show()
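The normalized table that `stacked_barplot` plots comes from `pd.crosstab(..., normalize="index")`, which turns each row of counts into shares that sum to 1. A minimal sketch on a made-up five-row frame (the `toy` frame and its values are illustrative and not taken from the dataset):

```python
import pandas as pd

# Toy data, illustrative only
toy = pd.DataFrame(
    {
        "segment": ["Online", "Online", "Online", "Offline", "Offline"],
        "status": [
            "Canceled",
            "Not_Canceled",
            "Canceled",
            "Not_Canceled",
            "Not_Canceled",
        ],
    }
)

# Each row is divided by its own total, so rows sum to 1
rates = pd.crosstab(toy["segment"], toy["status"], normalize="index")
print(rates)
```

Reading the result row by row gives the cancellation share within each category, which is what the stacked bars display.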
plt.figure(figsize=(15, 7))
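# restrict to numeric columns; the categorical columns cannot be correlated
sns.heatmap(df.select_dtypes(include=np.number).corr(), annot=True, vmin=-1, vmax=1)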
plt.show()
Repeated guests are correlated with the number of previous bookings not canceled (as expected), and also with the number of previous cancellations.
Repeated guests are also slightly correlated with lead time.
df_pair = df.copy()
df_pair.drop(
    [
        "no_of_adults",
        "no_of_children",
        "no_of_weekend_nights",
        "no_of_week_nights",
        "arrival_date",
        "no_of_special_requests",
        "arrival_year",
        "required_car_parking_space",
    ],
    axis=1,
    inplace=True,
)
sns.pairplot(df_pair, hue="booking_status")
plt.show()
plt.figure(figsize=(12, 6))
sns.barplot(
data=df, x="market_segment_type", y="avg_price_per_room", hue="booking_status"
)
plt.legend(loc="upper left")
plt.show()
plt.figure(figsize=(12, 6))
sns.countplot(data=df, x="market_segment_type", hue="booking_status")
plt.show()
stacked_barplot(data, "market_segment_type", "booking_status")
booking_status       Canceled  Not_Canceled    All
market_segment_type
All                     11885         24390  36275
Online                   8475         14739  23214
Offline                  3153          7375  10528
Corporate                 220          1797   2017
Aviation                   37            88    125
Complementary               0           391    391
------------------------------------------------------------------------------------------------------------------------
plt.figure(figsize=(12, 6))
sns.boxplot(data=data, x="market_segment_type", y="avg_price_per_room")
plt.show()
plt.figure(figsize=(12, 6))
sns.countplot(data=df, x="no_of_children", hue="booking_status")
plt.show()
stacked_barplot(data, "no_of_children", "booking_status")
booking_status  Canceled  Not_Canceled    All
no_of_children
All                11885         24390  36275
0                  10882         22695  33577
1                    540          1078   1618
2                    457           601   1058
3                      5            14     19
9                      1             1      2
10                     0             1      1
------------------------------------------------------------------------------------------------------------------------
plt.figure(figsize=(12, 6))
sns.countplot(data=df, x="required_car_parking_space", hue="booking_status")
plt.show()
stacked_barplot(data, "required_car_parking_space", "booking_status")
booking_status              Canceled  Not_Canceled    All
required_car_parking_space
All                            11885         24390  36275
0                              11771         23380  35151
1                                114          1010   1124
------------------------------------------------------------------------------------------------------------------------
plt.figure(figsize=(12, 6))
sns.countplot(data=df, x="no_of_special_requests", hue="booking_status")
plt.show()
stacked_barplot(data, "no_of_special_requests", "booking_status")
booking_status          Canceled  Not_Canceled    All
no_of_special_requests
All                        11885         24390  36275
0                           8545         11232  19777
1                           2703          8670  11373
2                            637          3727   4364
3                              0           675    675
4                              0            78     78
5                              0             8      8
------------------------------------------------------------------------------------------------------------------------
stacked_barplot(data, "no_of_adults", "booking_status")
booking_status  Canceled  Not_Canceled    All
no_of_adults
All                11885         24390  36275
2                   9119         16989  26108
1                   1856          5839   7695
3                    863          1454   2317
0                     44            95    139
4                      3            13     16
------------------------------------------------------------------------------------------------------------------------
stacked_barplot(data, "repeated_guest", "booking_status")
booking_status  Canceled  Not_Canceled    All
repeated_guest
All                11885         24390  36275
0                  11869         23476  35345
1                     16           914    930
------------------------------------------------------------------------------------------------------------------------
stacked_barplot(data, "arrival_month", "booking_status")
booking_status  Canceled  Not_Canceled    All
arrival_month
All                11885         24390  36275
10                  1880          3437   5317
9                   1538          3073   4611
8                   1488          2325   3813
7                   1314          1606   2920
6                   1291          1912   3203
4                    995          1741   2736
5                    948          1650   2598
11                   875          2105   2980
3                    700          1658   2358
2                    430          1274   1704
12                   402          2619   3021
1                     24           990   1014
------------------------------------------------------------------------------------------------------------------------
df1 = df.copy() # Make a copy of the data
df1.isnull().sum()
Booking_ID                              0
no_of_adults                            0
no_of_children                          0
no_of_weekend_nights                    0
no_of_week_nights                       0
type_of_meal_plan                       0
required_car_parking_space              0
room_type_reserved                      0
lead_time                               0
arrival_year                            0
arrival_month                           0
arrival_date                            0
market_segment_type                     0
repeated_guest                          0
no_of_previous_cancellations            0
no_of_previous_bookings_not_canceled    0
avg_price_per_room                      0
no_of_special_requests                  0
booking_status                          0
dtype: int64
df1.drop(
    ["Booking_ID"], axis=1, inplace=True
)  # drop the Booking_ID column: it is a unique identifier and carries no predictive information
df1.sample(5)
| | no_of_adults | no_of_children | no_of_weekend_nights | no_of_week_nights | type_of_meal_plan | required_car_parking_space | room_type_reserved | lead_time | arrival_year | arrival_month | arrival_date | market_segment_type | repeated_guest | no_of_previous_cancellations | no_of_previous_bookings_not_canceled | avg_price_per_room | no_of_special_requests | booking_status |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 31254 | 2 | 0 | 2 | 3 | Meal Plan 1 | 0 | Room_Type 1 | 10 | 2018 | 12 | 3 | Online | 0 | 0 | 0 | 89.59 | 0 | Not_Canceled |
| 3317 | 2 | 0 | 0 | 3 | Not Selected | 0 | Room_Type 1 | 37 | 2018 | 12 | 8 | Online | 0 | 0 | 0 | 88.00 | 0 | Canceled |
| 4843 | 2 | 0 | 1 | 2 | Meal Plan 1 | 0 | Room_Type 1 | 25 | 2018 | 10 | 17 | Offline | 0 | 0 | 0 | 85.00 | 0 | Not_Canceled |
| 13529 | 2 | 0 | 0 | 2 | Meal Plan 1 | 0 | Room_Type 1 | 16 | 2017 | 9 | 9 | Online | 0 | 0 | 0 | 105.00 | 2 | Not_Canceled |
| 33793 | 2 | 0 | 2 | 3 | Not Selected | 0 | Room_Type 1 | 200 | 2018 | 12 | 1 | Offline | 0 | 0 | 0 | 51.00 | 0 | Not_Canceled |
df1.booking_status = df1.booking_status.apply(lambda x: 1 if x == "Canceled" else 0)
df1["booking_status"] = df1["booking_status"].astype(float)
replaceStruct = {
    "type_of_meal_plan": {
        "Meal Plan 1": 1,
        "Meal Plan 2": 2,
        "Meal Plan 3": 3,
        "Not Selected": -1,
    },
}
df1 = df1.replace(replaceStruct)
dummy_col = ["room_type_reserved", "market_segment_type"]
df1 = pd.get_dummies(df1, columns=dummy_col)
df1.loc[df1["lead_time"] <= 30, "lead_time_category"] = 1
df1.loc[df1["lead_time"] > 30, "lead_time_category"] = 2
df1.loc[df1["lead_time"] > 60, "lead_time_category"] = 3
df1.loc[df1["lead_time"] > 90, "lead_time_category"] = 4
df1.loc[df1["lead_time"] > 120, "lead_time_category"] = 5
df1.loc[df1["lead_time"] > 150, "lead_time_category"] = 6
df1.loc[df1["lead_time"] > 180, "lead_time_category"] = 7
df1.loc[df1["lead_time"] > 210, "lead_time_category"] = 8
df1.loc[df1["lead_time"] > 240, "lead_time_category"] = 9
df1.loc[df1["lead_time"] > 270, "lead_time_category"] = 10
df1.loc[df1["lead_time"] > 300, "lead_time_category"] = 11
df1.loc[df1["lead_time"] > 330, "lead_time_category"] = 12
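The twelve assignments above bin `lead_time` into 30-day buckets, with everything above 330 days falling into the last one. The same coding can be sketched more compactly with `pd.cut`. The `toy` frame and the `edges` list below are illustrative assumptions, and the result is an integer column rather than the float column produced above:

```python
import numpy as np
import pandas as pd

# Toy data, illustrative only: lead times that sit on and around the bucket edges
toy = pd.DataFrame({"lead_time": [0, 30, 31, 60, 61, 330, 331, 443]})

# Right-closed 30-day bins: (-inf, 30], (30, 60], ..., (300, 330], (330, inf]
edges = [-np.inf] + list(range(30, 331, 30)) + [np.inf]

# labels=False returns the 0-based bin index; adding 1 matches the 1..12 coding
toy["lead_time_category"] = pd.cut(toy["lead_time"], bins=edges, labels=False) + 1
print(toy)
```

Keeping the bin edges in one list makes the bucket boundaries easy to audit and change, compared with a chain of overlapping conditions that relies on assignment order.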
df1.info()
<class 'pandas.core.frame.DataFrame'> RangeIndex: 36275 entries, 0 to 36274 Data columns (total 29 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 no_of_adults 36275 non-null int64 1 no_of_children 36275 non-null int64 2 no_of_weekend_nights 36275 non-null int64 3 no_of_week_nights 36275 non-null int64 4 type_of_meal_plan 36275 non-null int64 5 required_car_parking_space 36275 non-null int64 6 lead_time 36275 non-null int64 7 arrival_year 36275 non-null int64 8 arrival_month 36275 non-null int64 9 arrival_date 36275 non-null int64 10 repeated_guest 36275 non-null int64 11 no_of_previous_cancellations 36275 non-null int64 12 no_of_previous_bookings_not_canceled 36275 non-null int64 13 avg_price_per_room 36275 non-null float64 14 no_of_special_requests 36275 non-null int64 15 booking_status 36275 non-null float64 16 room_type_reserved_Room_Type 1 36275 non-null uint8 17 room_type_reserved_Room_Type 2 36275 non-null uint8 18 room_type_reserved_Room_Type 3 36275 non-null uint8 19 room_type_reserved_Room_Type 4 36275 non-null uint8 20 room_type_reserved_Room_Type 5 36275 non-null uint8 21 room_type_reserved_Room_Type 6 36275 non-null uint8 22 room_type_reserved_Room_Type 7 36275 non-null uint8 23 market_segment_type_Aviation 36275 non-null uint8 24 market_segment_type_Complementary 36275 non-null uint8 25 market_segment_type_Corporate 36275 non-null uint8 26 market_segment_type_Offline 36275 non-null uint8 27 market_segment_type_Online 36275 non-null uint8 28 lead_time_category 36275 non-null float64 dtypes: float64(3), int64(14), uint8(12) memory usage: 5.1 MB
df1.isnull().sum()
no_of_adults                            0
no_of_children                          0
no_of_weekend_nights                    0
no_of_week_nights                       0
type_of_meal_plan                       0
required_car_parking_space              0
lead_time                               0
arrival_year                            0
arrival_month                           0
arrival_date                            0
repeated_guest                          0
no_of_previous_cancellations            0
no_of_previous_bookings_not_canceled    0
avg_price_per_room                      0
no_of_special_requests                  0
booking_status                          0
room_type_reserved_Room_Type 1          0
room_type_reserved_Room_Type 2          0
room_type_reserved_Room_Type 3          0
room_type_reserved_Room_Type 4          0
room_type_reserved_Room_Type 5          0
room_type_reserved_Room_Type 6          0
room_type_reserved_Room_Type 7          0
market_segment_type_Aviation            0
market_segment_type_Complementary       0
market_segment_type_Corporate           0
market_segment_type_Offline             0
market_segment_type_Online              0
lead_time_category                      0
dtype: int64
numerical_col = df1.select_dtypes(include=np.number).columns.tolist()
plt.figure(figsize=(20, 30))
for i, variable in enumerate(numerical_col):
    plt.subplot(6, 5, i + 1)
    plt.boxplot(df1[variable], whis=1.5)
    plt.tight_layout()
    plt.title(variable)
plt.show()
# functions to treat outliers by flooring and capping
def treat_outliers(df, col):
    """
    Treats outliers in a variable

    df: dataframe
    col: dataframe column
    """
    Q1 = df[col].quantile(0.25)  # 25th quantile
    Q3 = df[col].quantile(0.75)  # 75th quantile
    IQR = Q3 - Q1
    Lower_Whisker = Q1 - 1.5 * IQR
    Upper_Whisker = Q3 + 1.5 * IQR

    # all the values smaller than Lower_Whisker will be assigned the value of Lower_Whisker
    # all the values greater than Upper_Whisker will be assigned the value of Upper_Whisker
    df[col] = np.clip(df[col], Lower_Whisker, Upper_Whisker)

    return df


def treat_outliers_all(df, col_list):
    """
    Treat outliers in a list of variables

    df: dataframe
    col_list: list of dataframe columns
    """
    for c in col_list:
        df = treat_outliers(df, c)

    return df
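To make the flooring-and-capping rule concrete, here is a small worked sketch of the same IQR arithmetic on a made-up series (the values in `s` are illustrative only). It uses `Series.clip`, which for this purpose behaves like the `np.clip` call above:

```python
import pandas as pd

# Toy series, illustrative only: one obvious high outlier
s = pd.Series([10, 12, 14, 16, 18, 100], dtype=float)

q1, q3 = s.quantile(0.25), s.quantile(0.75)  # 12.5 and 17.5
iqr = q3 - q1  # 5.0
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr  # whiskers at 5.0 and 25.0

# Values outside [lower, upper] are pulled back to the nearest whisker
clipped = s.clip(lower, upper)
print(lower, upper)
print(clipped.tolist())
```

Only the value 100 lies outside the whiskers, so it is capped at 25 while the other values are unchanged.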
selected_col = [
"avg_price_per_room",
"lead_time",
"no_of_week_nights",
"no_of_weekend_nights",
]
df2 = treat_outliers_all(df1, selected_col)
numerical_col = df1.select_dtypes(include=np.number).columns.tolist()
plt.figure(figsize=(20, 30))
for i, variable in enumerate(numerical_col):
    plt.subplot(6, 5, i + 1)
    plt.boxplot(df2[variable], whis=1.5)
    plt.tight_layout()
    plt.title(variable)
plt.show()
plt.figure(figsize=(12, 6))
sns.countplot(data=df2, x="lead_time_category", hue="booking_status")
plt.show()
distribution_plot_wrt_target(df2, "lead_time", "booking_status")
numerical_col = df2.select_dtypes(include=np.number).columns.tolist()
plt.figure(figsize=(20, 20))
for i, variable in enumerate(numerical_col):
    plt.subplot(6, 5, i + 1)
    plt.hist(df2[variable])
    plt.tight_layout()
    plt.title(variable)
plt.show()
df2.drop(
    ["lead_time_category"], axis=1, inplace=True
)  # drop the lead_time_category column, which is no longer used
X = df2.drop(["booking_status"], axis=1)
Y = df2["booking_status"]
# creating dummies
X = pd.get_dummies(X, drop_first=True)
# Splitting data in train and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, Y, test_size=0.30, random_state=1, stratify=Y
)
print("Shape of Training set : ", X_train.shape)
print("Shape of test set : ", X_test.shape)
print("Percentage of classes in training set:")
print(y_train.value_counts(normalize=True))
print("Percentage of classes in test set:")
print(y_test.value_counts(normalize=True))
Shape of Training set :  (25392, 27)
Shape of test set :  (10883, 27)
Percentage of classes in training set:
0.0    0.672377
1.0    0.327623
Name: booking_status, dtype: float64
Percentage of classes in test set:
0.0    0.672333
1.0    0.327667
Name: booking_status, dtype: float64
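Passing `stratify=Y` is what keeps the share of canceled bookings nearly identical in the training and test sets. A minimal sketch of that effect on a synthetic target with 30% positives (the `X_toy` and `y_toy` objects are made up for illustration):

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy target, illustrative only: 300 positives out of 1000 rows
y_toy = pd.Series([1] * 300 + [0] * 700)
X_toy = pd.DataFrame({"x": np.arange(len(y_toy))})

X_tr, X_te, y_tr, y_te = train_test_split(
    X_toy, y_toy, test_size=0.30, random_state=1, stratify=y_toy
)

# With stratification both shares stay at (or very near) the overall 30%
print(y_tr.mean(), y_te.mean())
```

Without `stratify`, the two shares would drift apart by chance, which makes metrics on an imbalanced target harder to compare between the two sets.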
accuracy_score(y_train, np.ones_like(y_train))
0.3276228733459357
recall_score(y_train, np.ones_like(y_train))
1.0
precision_score(y_train, np.ones_like(y_train))
0.3276228733459357
f1_score(y_train, np.ones_like(y_train))
0.4935481000266975
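The four scores above describe a naive baseline that labels every booking as canceled. For such a baseline, recall is 1 and both accuracy and precision equal the share of canceled bookings, so the F1 score follows directly. A short check of that arithmetic, using the share printed above as the only input:

```python
# Share of canceled bookings in y_train, copied from the output above
p = 0.3276228733459357

precision = p  # every booking is flagged, so only a fraction p of the flags is right
recall = 1.0  # no canceled booking is missed
f1 = 2 * precision * recall / (precision + recall)
print(f1)
```

Any fitted model should beat these numbers to be considered useful.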
Both the cases are important as:
If we predict that a booking will not be canceled and the booking gets canceled then the hotel will lose resources and will have to bear additional costs of distribution channels.
If we predict that a booking will get canceled and the booking doesn't get canceled the hotel might not be able to provide satisfactory services to the customer by assuming that this booking will be canceled. This might damage the brand equity.
# defining a function to compute different metrics to check performance of a classification model built using statsmodels
def model_performance_classification_statsmodels(
    model, predictors, target, threshold=0.5
):
    """
    Function to compute different metrics to check classification model performance

    model: classifier
    predictors: independent variables
    target: dependent variable
    threshold: threshold for classifying the observation as class 1
    """

    # checking which probabilities are greater than threshold
    pred_temp = model.predict(predictors) > threshold
    # rounding off the above values to get classes
    pred = np.round(pred_temp)

    acc = accuracy_score(target, pred)  # to compute Accuracy
    recall = recall_score(target, pred)  # to compute Recall
    precision = precision_score(target, pred)  # to compute Precision
    f1 = f1_score(target, pred)  # to compute F1-score

    # creating a dataframe of metrics
    df_perf = pd.DataFrame(
        {"Accuracy": acc, "Recall": recall, "Precision": precision, "F1": f1,},
        index=[0],
    )

    return df_perf
# defining a function to plot the confusion_matrix of a classification model
def confusion_matrix_statsmodels(model, predictors, target, threshold=0.5):
    """
    To plot the confusion_matrix with percentages

    model: classifier
    predictors: independent variables
    target: dependent variable
    threshold: threshold for classifying the observation as class 1
    """
    y_pred = model.predict(predictors) > threshold
    cm = confusion_matrix(target, y_pred)
    labels = np.asarray(
        [
            ["{0:0.0f}".format(item) + "\n{0:.2%}".format(item / cm.flatten().sum())]
            for item in cm.flatten()
        ]
    ).reshape(2, 2)

    plt.figure(figsize=(6, 4))
    sns.heatmap(cm, annot=labels, fmt="")
    plt.ylabel("True label")
    plt.xlabel("Predicted label")
X = df2.drop(["booking_status"], axis=1)
Y = df2["booking_status"]
# adding a constant to X variable
X = add_constant(X)
# creating dummies
X = pd.get_dummies(X, drop_first=True)
# Splitting data in train and test sets
X_train, X_test, y_train, y_test = train_test_split(
X, Y, test_size=0.30, random_state=1, stratify=Y
)
# fitting the model on training set
logit = sm.Logit(y_train, X_train)
lg = logit.fit()
Warning: Maximum number of iterations has been exceeded.
Current function value: 0.422920
Iterations: 35
C:\Anacoda\lib\site-packages\statsmodels\base\model.py:566: ConvergenceWarning: Maximum Likelihood optimization failed to converge. Check mle_retvals
warnings.warn("Maximum Likelihood optimization failed to "
# let's print the logistic regression summary
print(lg.summary())
Logit Regression Results
==============================================================================
Dep. Variable: booking_status No. Observations: 25392
Model: Logit Df Residuals: 25366
Method: MLE Df Model: 25
Date: Fri, 25 Mar 2022 Pseudo R-squ.: 0.3313
Time: 18:48:29 Log-Likelihood: -10739.
converged: False LL-Null: -16060.
Covariance Type: nonrobust LLR p-value: 0.000
========================================================================================================
coef std err z P>|z| [0.025 0.975]
--------------------------------------------------------------------------------------------------------
const -636.8479 7.41e+05 -0.001 0.999 -1.45e+06 1.45e+06
no_of_adults 0.0414 0.038 1.098 0.272 -0.032 0.115
no_of_children 0.0834 0.061 1.378 0.168 -0.035 0.202
no_of_weekend_nights 0.1491 0.020 7.566 0.000 0.110 0.188
no_of_week_nights 0.0011 0.014 0.083 0.934 -0.026 0.028
type_of_meal_plan -0.0543 0.025 -2.172 0.030 -0.103 -0.005
required_car_parking_space -1.6519 0.138 -11.968 0.000 -1.922 -1.381
lead_time 0.0166 0.000 60.448 0.000 0.016 0.017
arrival_year 0.4204 0.059 7.101 0.000 0.304 0.536
arrival_month -0.0475 0.006 -7.311 0.000 -0.060 -0.035
arrival_date 0.0035 0.002 1.790 0.073 -0.000 0.007
repeated_guest -1.7140 0.701 -2.446 0.014 -3.087 -0.341
no_of_previous_cancellations 0.3378 0.102 3.326 0.001 0.139 0.537
no_of_previous_bookings_not_canceled -1.5389 0.952 -1.616 0.106 -3.405 0.328
avg_price_per_room 0.0201 0.001 25.854 0.000 0.019 0.022
no_of_special_requests -1.4885 0.030 -49.012 0.000 -1.548 -1.429
room_type_reserved_Room_Type 1 -90.6932 nan nan nan nan nan
room_type_reserved_Room_Type 2 -91.1131 nan nan nan nan nan
room_type_reserved_Room_Type 3 -89.5603 nan nan nan nan nan
room_type_reserved_Room_Type 4 -91.0142 nan nan nan nan nan
room_type_reserved_Room_Type 5 -91.3891 nan nan nan nan nan
room_type_reserved_Room_Type 6 -91.3312 nan nan nan nan nan
room_type_reserved_Room_Type 7 -91.5691 nan nan nan nan nan
market_segment_type_Aviation -123.4026 nan nan nan nan nan
market_segment_type_Complementary -140.3886 nan nan nan nan nan
market_segment_type_Corporate -124.3430 nan nan nan nan nan
market_segment_type_Offline -125.2120 nan nan nan nan nan
market_segment_type_Online -123.4804 nan nan nan nan nan
========================================================================================================
print("Training performance:")
model_performance_classification_statsmodels(lg, X_train, y_train)
Training performance:
| | Accuracy | Recall | Precision | F1 |
|---|---|---|---|---|
| 0 | 0.806278 | 0.636014 | 0.736703 | 0.682666 |
Observations
A negative coefficient means that the probability of a booking being canceled decreases as the corresponding attribute value increases.
A positive coefficient means that the probability of a booking being canceled increases as the corresponding attribute value increases.
The p-value of a variable indicates whether the variable is significant. At a significance level of 0.05 (5%), any variable with a p-value below 0.05 is considered significant.
However, these variables might be multicollinear, which affects the p-values. Here the fit did not converge (converged: False) and the dummy-variable coefficients have nan standard errors, which already points to this problem.
We have to remove multicollinearity from the data to get reliable coefficients and p-values.
There are different ways of detecting (or testing for) multicollinearity; one such way is the Variance Inflation Factor (VIF).
vif_series = pd.Series(
[variance_inflation_factor(X_train.values, i) for i in range(X_train.shape[1])],
index=X_train.columns,
dtype=float,
)
print(
"Series before feature selection: \n\n{}\n".format(
vif_series.sort_values(ascending=False)
)
)
C:\Anacoda\lib\site-packages\statsmodels\regression\linear_model.py:1715: RuntimeWarning: divide by zero encountered in double_scalars
  return 1 - self.ssr/self.centered_tss
C:\Anacoda\lib\site-packages\statsmodels\stats\outliers_influence.py:193: RuntimeWarning: divide by zero encountered in double_scalars
  vif = 1. / (1. - r_squared_i)
Series before feature selection:

market_segment_type_Online                   inf
room_type_reserved_Room_Type 1               inf
market_segment_type_Offline                  inf
market_segment_type_Corporate                inf
market_segment_type_Complementary            inf
market_segment_type_Aviation                 inf
room_type_reserved_Room_Type 7               inf
room_type_reserved_Room_Type 6               inf
room_type_reserved_Room_Type 5               inf
room_type_reserved_Room_Type 4               inf
room_type_reserved_Room_Type 3               inf
room_type_reserved_Room_Type 2               inf
no_of_children                          1.999380
avg_price_per_room                      1.896121
repeated_guest                          1.750991
no_of_previous_bookings_not_canceled    1.570349
arrival_year                            1.403503
type_of_meal_plan                       1.399971
lead_time                               1.370191
no_of_adults                            1.348639
no_of_previous_cancellations            1.321870
arrival_month                           1.272803
no_of_special_requests                  1.248451
no_of_week_nights                       1.088057
no_of_weekend_nights                    1.051674
required_car_parking_space              1.034406
arrival_date                            1.006699
const                                   0.000000
dtype: float64
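The inf values follow from the VIF formula, VIF_i = 1 / (1 - R_i^2), where R_i^2 comes from regressing feature i on all the other features. When a feature can be predicted exactly from the others, R_i^2 is 1 and the ratio blows up. That is most likely what happens here: the dummy columns of one categorical variable appear to add up to the constant column. The sketch below shows the two-feature case, where R^2 is just the squared Pearson correlation. It is plain Python on made-up toy columns (`room_a`, `room_b`, `price`), and `vif_two_features` is a hypothetical helper, not the statsmodels function used in the notebook.

```python
import math


def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def vif_two_features(x, y):
    """VIF of x when y is the only other predictor besides the constant."""
    r_squared = pearson_r(x, y) ** 2
    if r_squared >= 1.0 - 1e-12:
        return math.inf  # perfect collinearity: 1 / (1 - 1)
    return 1.0 / (1.0 - r_squared)


# toy data: two dummies that always sum to 1, plus an unrelated price column
room_a = [1, 0, 1, 0, 0, 1]
room_b = [0, 1, 0, 1, 1, 0]
price = [100, 90, 80, 120, 100, 110]

vif_collinear = vif_two_features(room_a, room_b)
vif_mild = vif_two_features(room_a, price)
print(vif_collinear, round(vif_mild, 4))
```

Dropping one column from each complete dummy set (or dropping the whole set, as done next) removes the exact linear dependence.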
X_train1 = X_train.drop(
[
"market_segment_type_Online",
"market_segment_type_Offline",
"market_segment_type_Corporate",
"market_segment_type_Complementary",
"market_segment_type_Aviation",
],
axis=1,
)
vif_series2 = pd.Series(
[variance_inflation_factor(X_train1.values, i) for i in range(X_train1.shape[1])],
index=X_train1.columns,
)
print(
"Series before feature selection: \n\n{}\n".format(
vif_series2.sort_values(ascending=False)
)
)
Series before feature selection:

room_type_reserved_Room_Type 7               inf
room_type_reserved_Room_Type 6               inf
room_type_reserved_Room_Type 5               inf
room_type_reserved_Room_Type 4               inf
room_type_reserved_Room_Type 3               inf
room_type_reserved_Room_Type 2               inf
room_type_reserved_Room_Type 1               inf
no_of_children                          1.990595
avg_price_per_room                      1.610823
repeated_guest                          1.559362
no_of_previous_bookings_not_canceled    1.556637
arrival_year                            1.389348
no_of_previous_cancellations            1.307243
lead_time                               1.294502
no_of_adults                            1.293707
arrival_month                           1.265042
type_of_meal_plan                       1.203315
no_of_special_requests                  1.134035
no_of_week_nights                       1.076705
no_of_weekend_nights                    1.036531
required_car_parking_space              1.029296
arrival_date                            1.006091
const                                   0.000000
dtype: float64
logit1 = sm.Logit(y_train, X_train1.astype(float))
lg1 = logit1.fit()
print("Training performance:")
model_performance_classification_statsmodels(lg1, X_train1, y_train)
Optimization terminated successfully.
Current function value: 0.448115
Iterations 15
Training performance:
| | Accuracy | Recall | Precision | F1 |
|---|---|---|---|---|
| 0 | 0.788674 | 0.5812 | 0.719815 | 0.643123 |
X_train2 = X_train1.drop(
[
"room_type_reserved_Room_Type 1",
"room_type_reserved_Room_Type 2",
"room_type_reserved_Room_Type 3",
"room_type_reserved_Room_Type 4",
"room_type_reserved_Room_Type 5",
"room_type_reserved_Room_Type 6",
"room_type_reserved_Room_Type 7",
],
axis=1,
)
vif_series3 = pd.Series(
[variance_inflation_factor(X_train2.values, i) for i in range(X_train2.shape[1])],
index=X_train2.columns,
)
print(
"Series before feature selection: \n\n{}\n".format(
vif_series3.sort_values(ascending=False)
)
)
Series before feature selection:

const                                   3.791137e+07
no_of_previous_bookings_not_canceled    1.554995e+00
repeated_guest                          1.551673e+00
avg_price_per_room                      1.452163e+00
arrival_year                            1.372549e+00
no_of_previous_cancellations            1.306709e+00
arrival_month                           1.263378e+00
lead_time                               1.260392e+00
no_of_adults                            1.224156e+00
type_of_meal_plan                       1.170730e+00
no_of_children                          1.138765e+00
no_of_special_requests                  1.126482e+00
no_of_week_nights                       1.065280e+00
no_of_weekend_nights                    1.034507e+00
required_car_parking_space              1.028674e+00
arrival_date                            1.005543e+00
dtype: float64
logit2 = sm.Logit(y_train, X_train2.astype(float))
lg2 = logit2.fit()
print("Training performance:")
model_performance_classification_statsmodels(lg2, X_train2, y_train)
Optimization terminated successfully.
Current function value: 0.449142
Iterations 16
Training performance:
| | Accuracy | Recall | Precision | F1 |
|---|---|---|---|---|
| 0 | 0.78702 | 0.577233 | 0.717466 | 0.639755 |
Observations:
print(lg2.summary())
Logit Regression Results
==============================================================================
Dep. Variable: booking_status No. Observations: 25392
Model: Logit Df Residuals: 25376
Method: MLE Df Model: 15
Date: Fri, 25 Mar 2022 Pseudo R-squ.: 0.2899
Time: 18:48:36 Log-Likelihood: -11405.
converged: True LL-Null: -16060.
Covariance Type: nonrobust LLR p-value: 0.000
========================================================================================================
coef std err z P>|z| [0.025 0.975]
--------------------------------------------------------------------------------------------------------
const -1212.5748 115.100 -10.535 0.000 -1438.166 -986.984
no_of_adults 0.0919 0.035 2.651 0.008 0.024 0.160
no_of_children 0.0565 0.043 1.316 0.188 -0.028 0.141
no_of_weekend_nights 0.2033 0.019 10.740 0.000 0.166 0.240
no_of_week_nights 0.0307 0.013 2.359 0.018 0.005 0.056
type_of_meal_plan -0.3631 0.022 -16.310 0.000 -0.407 -0.319
required_car_parking_space -1.4072 0.134 -10.471 0.000 -1.671 -1.144
lead_time 0.0144 0.000 59.006 0.000 0.014 0.015
arrival_year 0.5991 0.057 10.503 0.000 0.487 0.711
arrival_month -0.0498 0.006 -7.911 0.000 -0.062 -0.037
arrival_date 0.0043 0.002 2.295 0.022 0.001 0.008
repeated_guest -1.4860 0.560 -2.652 0.008 -2.584 -0.388
no_of_previous_cancellations 0.3042 0.101 3.020 0.003 0.107 0.502
no_of_previous_bookings_not_canceled -1.7557 0.990 -1.773 0.076 -3.696 0.185
avg_price_per_room 0.0231 0.001 33.401 0.000 0.022 0.024
no_of_special_requests -1.1765 0.027 -42.887 0.000 -1.230 -1.123
========================================================================================================
Note: Variables with a high p-value can be removed manually by picking the one with the highest p-value, dropping it, and building the model again, one variable at a time. That is tedious, so using a loop is more efficient.
# running a loop to drop variables with high p-value

# initial list of columns
cols = X_train2.columns.tolist()

# setting an initial max p-value
max_p_value = 1

while len(cols) > 0:
    # defining the train set
    X_train_aux = X_train2[cols]

    # fitting the model
    model = sm.Logit(y_train, X_train_aux).fit(disp=False)

    # getting the p-values and the maximum p-value
    p_values = model.pvalues
    max_p_value = max(p_values)

    # name of the variable with maximum p-value
    feature_with_p_max = p_values.idxmax()

    if max_p_value > 0.05:
        cols.remove(feature_with_p_max)
    else:
        break

selected_features = cols
print(selected_features)
['const', 'no_of_adults', 'no_of_weekend_nights', 'no_of_week_nights', 'type_of_meal_plan', 'required_car_parking_space', 'lead_time', 'arrival_year', 'arrival_month', 'arrival_date', 'repeated_guest', 'no_of_previous_cancellations', 'avg_price_per_room', 'no_of_special_requests']
X_train3 = X_train2[selected_features]
logit3 = sm.Logit(y_train, X_train3.astype(float))
lg3 = logit3.fit()
print(lg3.summary())
Optimization terminated successfully.
Current function value: 0.449404
Iterations 11
Logit Regression Results
==============================================================================
Dep. Variable: booking_status No. Observations: 25392
Model: Logit Df Residuals: 25378
Method: MLE Df Model: 13
Date: Fri, 25 Mar 2022 Pseudo R-squ.: 0.2895
Time: 18:48:39 Log-Likelihood: -11411.
converged: True LL-Null: -16060.
Covariance Type: nonrobust LLR p-value: 0.000
================================================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------------------------
const -1202.1109 115.119 -10.442 0.000 -1427.741 -976.481
no_of_adults 0.0857 0.034 2.500 0.012 0.019 0.153
no_of_weekend_nights 0.2044 0.019 10.804 0.000 0.167 0.241
no_of_week_nights 0.0316 0.013 2.425 0.015 0.006 0.057
type_of_meal_plan -0.3643 0.022 -16.367 0.000 -0.408 -0.321
required_car_parking_space -1.4025 0.134 -10.450 0.000 -1.666 -1.139
lead_time 0.0144 0.000 59.125 0.000 0.014 0.015
arrival_year 0.5939 0.057 10.410 0.000 0.482 0.706
arrival_month -0.0501 0.006 -7.962 0.000 -0.062 -0.038
arrival_date 0.0043 0.002 2.274 0.023 0.001 0.008
repeated_guest -2.5366 0.453 -5.594 0.000 -3.425 -1.648
no_of_previous_cancellations 0.2363 0.077 3.083 0.002 0.086 0.386
avg_price_per_room 0.0234 0.001 35.833 0.000 0.022 0.025
no_of_special_requests -1.1753 0.027 -42.961 0.000 -1.229 -1.122
================================================================================================
print("Training performance:")
model_performance_classification_statsmodels(lg3, X_train3, y_train)
Training performance:
| | Accuracy | Recall | Precision | F1 |
|---|---|---|---|---|
| 0 | 0.787138 | 0.577473 | 0.717658 | 0.639979 |
# converting coefficients to odds
odds = np.exp(lg3.params)
# finding the percentage change
perc_change_odds = (np.exp(lg3.params) - 1) * 100
# removing limit from number of columns to display
pd.set_option("display.max_columns", None)
# adding the odds to a dataframe
pd.DataFrame({"Odds": odds, "Change_odd%": perc_change_odds}, index=X_train3.columns).T
| | const | no_of_adults | no_of_weekend_nights | no_of_week_nights | type_of_meal_plan | required_car_parking_space | lead_time | arrival_year | arrival_month | arrival_date | repeated_guest | no_of_previous_cancellations | avg_price_per_room | no_of_special_requests |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Odds | 0.0 | 1.089464 | 1.226745 | 1.032097 | 0.694660 | 0.245988 | 1.014492 | 1.811065 | 0.951089 | 1.004270 | 0.079137 | 1.266534 | 1.023714 | 0.308737 |
| Change_odd% | -100.0 | 8.946383 | 22.674519 | 3.209689 | -30.534044 | -75.401164 | 1.449175 | 81.106544 | -4.891070 | 0.426974 | -92.086280 | 26.653447 | 2.371355 | -69.126284 |
"avg_price_per_room": Holding all other features constant, a 1-unit increase in avg_price_per_room multiplies the odds of a booking being canceled by about 1.02, that is, a 2.37% increase in the odds of cancellation.
Interpretation for other attributes can be done similarly.
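As a worked example of that interpretation, the sketch below converts a few coefficients into odds multipliers using only the standard library. The coefficient values are copied, rounded, from the lg3 summary above. The helper names `odds_multiplier` and `percent_change_in_odds` are illustrative assumptions.

```python
import math


def odds_multiplier(coef, delta=1.0):
    """Factor by which the odds change when a feature moves by `delta` units."""
    return math.exp(coef * delta)


def percent_change_in_odds(coef, delta=1.0):
    """The same effect expressed as a percentage change in the odds."""
    return (odds_multiplier(coef, delta) - 1.0) * 100.0


# coefficients as printed (rounded) in the lg3 summary
coefs = {
    "avg_price_per_room": 0.0234,
    "lead_time": 0.0144,
    "no_of_special_requests": -1.1753,
}

for name, coef in coefs.items():
    print(name, round(odds_multiplier(coef), 4), round(percent_change_in_odds(coef), 2))

# effects compound multiplicatively: a 10-unit price increase is exp(10 * coef)
ten_unit_price = odds_multiplier(coefs["avg_price_per_room"], delta=10)
print(round(ten_unit_price, 4))
```

Because the summary rounds the coefficients to four decimals, these numbers agree with the Odds row only to about three decimal places.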
# creating confusion matrix
confusion_matrix_statsmodels(lg3, X_train3, y_train, threshold=0.5)
print("Training performance:")
log_reg_model_train_perf = model_performance_classification_statsmodels(
lg3, X_train3, y_train, threshold=0.5
)
log_reg_model_train_perf
Training performance:
| | Accuracy | Recall | Precision | F1 |
|---|---|---|---|---|
| 0 | 0.787138 | 0.577473 | 0.717658 | 0.639979 |
logit_roc_auc_train = roc_auc_score(y_train, lg3.predict(X_train3))
fpr, tpr, thresholds = roc_curve(y_train, lg3.predict(X_train3))
plt.figure(figsize=(7, 5))
plt.plot(fpr, tpr, label="Logistic Regression (area = %0.2f)" % logit_roc_auc_train)
plt.plot([0, 1], [0, 1], "r--")
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel("False Positive Rate")
plt.ylabel("True Positive Rate")
plt.title("Receiver operating characteristic")
plt.legend(loc="lower right")
plt.show()
# Optimal threshold as per AUC-ROC curve
# The optimal cut off would be where tpr is high and fpr is low
fpr, tpr, thresholds = roc_curve(y_train, lg3.predict(X_train3))
optimal_idx = np.argmax(tpr - fpr)
optimal_threshold_auc_roc = thresholds[optimal_idx]
print(optimal_threshold_auc_roc)
0.3178870183397934
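The `np.argmax(tpr - fpr)` step picks the threshold that maximizes the gap between the true-positive rate and the false-positive rate (Youden's J statistic). A minimal sketch of the same idea in plain Python is shown below. The labels and scores are made up, and `rates_at_threshold` and `youden_threshold` are hypothetical helpers rather than sklearn functions.

```python
def rates_at_threshold(y_true, scores, threshold):
    """TPR and FPR when predicting class 1 for score >= threshold."""
    tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)
    positives = sum(y_true)
    negatives = len(y_true) - positives
    return tp / positives, fp / negatives


def youden_threshold(y_true, scores):
    """Observed score with the largest TPR - FPR (first maximum wins on ties)."""
    best_threshold, best_j = None, -1.0
    # scan from the highest score down, as a ROC curve does
    for threshold in sorted(set(scores), reverse=True):
        tpr, fpr = rates_at_threshold(y_true, scores, threshold)
        if tpr - fpr > best_j:
            best_threshold, best_j = threshold, tpr - fpr
    return best_threshold, best_j


# toy labels and predicted probabilities (made up for illustration)
y_toy = [0, 0, 1, 0, 1, 1, 0, 1]
p_toy = [0.10, 0.20, 0.35, 0.40, 0.55, 0.65, 0.30, 0.80]

best_t, best_j = youden_threshold(y_toy, p_toy)
print(best_t, best_j)
```

This criterion weights both error types equally, which fits the earlier point that missed cancellations and false alarms both carry a cost.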
# creating confusion matrix
confusion_matrix_statsmodels(
lg3, X_train3, y_train, threshold=optimal_threshold_auc_roc
)
# checking model performance for this model
log_reg_model_train_perf_threshold_auc_roc = model_performance_classification_statsmodels(
lg3, X_train3, y_train, threshold=optimal_threshold_auc_roc
)
print("Training performance:")
log_reg_model_train_perf_threshold_auc_roc
Training performance:
| | Accuracy | Recall | Precision | F1 |
|---|---|---|---|---|
| 0 | 0.766186 | 0.769564 | 0.614277 | 0.683208 |
y_scores = lg3.predict(X_train3)
prec, rec, tre = precision_recall_curve(y_train, y_scores,)
def plot_prec_recall_vs_tresh(precisions, recalls, thresholds):
    plt.plot(thresholds, precisions[:-1], "b--", label="precision")
    plt.plot(thresholds, recalls[:-1], "g--", label="recall")
    plt.xlabel("Threshold")
    plt.legend(loc="upper left")
    plt.ylim([0, 1])


plt.figure(figsize=(10, 7))
plot_prec_recall_vs_tresh(prec, rec, tre)
plt.show()
# setting the threshold
optimal_threshold_curve = 0.42
# creating confusion matrix
confusion_matrix_statsmodels(lg3, X_train3, y_train, threshold=optimal_threshold_curve)
log_reg_model_train_perf_threshold_curve = model_performance_classification_statsmodels(
lg3, X_train3, y_train, threshold=optimal_threshold_curve
)
print("Training performance:")
log_reg_model_train_perf_threshold_curve
Training performance:
| | Accuracy | Recall | Precision | F1 |
|---|---|---|---|---|
| 0 | 0.781191 | 0.660777 | 0.667841 | 0.66429 |
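At 0.42, precision and recall are nearly equal (both about 0.66), which suggests the threshold was read off the plot near the point where the two curves cross. Under that assumption, the crossover could also be located programmatically by scanning a grid of thresholds. The sketch below does this in plain Python on made-up data; `precision_recall_at` and `crossover_threshold` are hypothetical helpers.

```python
import math


def precision_recall_at(y_true, scores, threshold):
    """Precision and recall when predicting class 1 for score > threshold."""
    tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s > threshold)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s > threshold)
    fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s <= threshold)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def crossover_threshold(y_true, scores, candidates):
    """Candidate threshold where |precision - recall| is smallest."""

    def gap(threshold):
        precision, recall = precision_recall_at(y_true, scores, threshold)
        if recall == 0.0:
            return math.inf  # nothing is caught: not a useful crossover
        return abs(precision - recall)

    return min(candidates, key=gap)


# toy labels and predicted probabilities (made up for illustration)
y_toy = [0, 0, 1, 0, 1, 1, 0, 1, 1, 0]
p_toy = [0.10, 0.20, 0.35, 0.40, 0.55, 0.65, 0.30, 0.80, 0.25, 0.60]
grid = [k / 100 for k in range(5, 100, 5)]  # 0.05, 0.10, ..., 0.95

balanced_t = crossover_threshold(y_toy, p_toy, grid)
print(balanced_t, precision_recall_at(y_toy, p_toy, balanced_t))
```

A finer grid gives a more precise crossover, at the cost of more passes over the data.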
# training performance comparison
models_train_comp_df = pd.concat(
[
log_reg_model_train_perf.T,
log_reg_model_train_perf_threshold_auc_roc.T,
log_reg_model_train_perf_threshold_curve.T,
],
axis=1,
)
models_train_comp_df.columns = [
"Logistic Regression-default Threshold (0.5)",
"Logistic Regression-0.31 Threshold",
"Logistic Regression-0.42 Threshold",
]
print("Training performance comparison:")
models_train_comp_df
Training performance comparison:
| | Logistic Regression-default Threshold (0.5) | Logistic Regression-0.31 Threshold | Logistic Regression-0.42 Threshold |
|---|---|---|---|
| Accuracy | 0.787138 | 0.766186 | 0.781191 |
| Recall | 0.577473 | 0.769564 | 0.660777 |
| Precision | 0.717658 | 0.614277 | 0.667841 |
| F1 | 0.639979 | 0.683208 | 0.664290 |
Dropping the columns from the test set that were dropped from the training set
X_test3 = X_test[list(X_train3.columns)]
# creating confusion matrix
confusion_matrix_statsmodels(lg3, X_test3, y_test)
log_reg_model_test_perf = model_performance_classification_statsmodels(
lg3, X_test3, y_test
)
print("Test performance:")
log_reg_model_test_perf
Test performance:
| | Accuracy | Recall | Precision | F1 |
|---|---|---|---|---|
| 0 | 0.784067 | 0.574313 | 0.711111 | 0.635433 |
# ROC-AUC on the test set (named *_test to avoid confusion with the training value)
logit_roc_auc_test = roc_auc_score(y_test, lg3.predict(X_test3))
fpr, tpr, thresholds = roc_curve(y_test, lg3.predict(X_test3))
plt.figure(figsize=(7, 5))
plt.plot(fpr, tpr, label="Logistic Regression (area = %0.2f)" % logit_roc_auc_test)
plt.plot([0, 1], [0, 1], "r--")
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel("False Positive Rate")
plt.ylabel("True Positive Rate")
plt.title("Receiver operating characteristic")
plt.legend(loc="lower right")
plt.show()
Using model with threshold = 0.318 (optimal_threshold_auc_roc)
# creating confusion matrix
confusion_matrix_statsmodels(lg3, X_test3, y_test, threshold=optimal_threshold_auc_roc)
# checking model performance for this model
log_reg_model_test_perf_threshold_auc_roc = model_performance_classification_statsmodels(
lg3, X_test3, y_test, threshold=optimal_threshold_auc_roc
)
print("Test performance:")
log_reg_model_test_perf_threshold_auc_roc
Test performance:
| | Accuracy | Recall | Precision | F1 |
|---|---|---|---|---|
| 0 | 0.759809 | 0.763881 | 0.605872 | 0.675763 |
Using model with threshold = 0.42
# creating confusion matrix
confusion_matrix_statsmodels(lg3, X_test3, y_test, threshold=optimal_threshold_curve)
log_reg_model_test_perf_threshold_curve = model_performance_classification_statsmodels(
lg3, X_test3, y_test, threshold=optimal_threshold_curve
)
print("Test performance:")
log_reg_model_test_perf_threshold_curve
Test performance:
| | Accuracy | Recall | Precision | F1 |
|---|---|---|---|---|
| 0 | 0.776165 | 0.654515 | 0.659695 | 0.657095 |
# training performance comparison
models_train_comp_df = pd.concat(
[
log_reg_model_train_perf.T,
log_reg_model_train_perf_threshold_auc_roc.T,
log_reg_model_train_perf_threshold_curve.T,
],
axis=1,
)
models_train_comp_df.columns = [
"Logistic Regression-default Threshold (0.5)",
"Logistic Regression-0.31 Threshold",
"Logistic Regression-0.42 Threshold",
]
print("Training performance comparison:")
models_train_comp_df
Training performance comparison:
| | Logistic Regression-default Threshold (0.5) | Logistic Regression-0.31 Threshold | Logistic Regression-0.42 Threshold |
|---|---|---|---|
| Accuracy | 0.787138 | 0.766186 | 0.781191 |
| Recall | 0.577473 | 0.769564 | 0.660777 |
| Precision | 0.717658 | 0.614277 | 0.667841 |
| F1 | 0.639979 | 0.683208 | 0.664290 |
# testing performance comparison
models_test_comp_df = pd.concat(
[
log_reg_model_test_perf.T,
log_reg_model_test_perf_threshold_auc_roc.T,
log_reg_model_test_perf_threshold_curve.T,
],
axis=1,
)
models_test_comp_df.columns = [
"Logistic Regression-default Threshold (0.5)",
"Logistic Regression-0.31 Threshold",
"Logistic Regression-0.42 Threshold",
]
print("Test set performance comparison:")
models_test_comp_df
Test set performance comparison:
| | Logistic Regression-default Threshold (0.5) | Logistic Regression-0.31 Threshold | Logistic Regression-0.42 Threshold |
|---|---|---|---|
| Accuracy | 0.784067 | 0.759809 | 0.776165 |
| Recall | 0.574313 | 0.763881 | 0.654515 |
| Precision | 0.711111 | 0.605872 | 0.659695 |
| F1 | 0.635433 | 0.675763 | 0.657095 |
We have built a predictive model that INN Hotels can use to estimate the chance of a booking being canceled and to inform its cancellation policies. At a threshold of 0.42, it reaches an F1 score of about 0.66 on the test set.
All the logistic regression models perform similarly on the training and test sets, so they generalize well.
Changing the threshold trades recall against precision but does not raise both together, so the F1 score stays between roughly 0.64 and 0.68 on both sets.
The data was also imbalanced (about 33% canceled bookings) when the logistic model was fitted.
Finally, feature selection (dropping the collinear dummy variables and the variables with p-values above 0.05) was applied to make the coefficients more reliable.
X = df2.drop(["booking_status"], axis=1)
Y = df2["booking_status"]
# Splitting data in train and test sets
X_train, X_test, y_train, y_test = train_test_split(
X, Y, test_size=0.30, random_state=1
)
First, let's create functions to calculate different metrics and confusion matrix so that we don't have to use the same code repeatedly for each model.
# defining a function to compute different metrics to check performance of a classification model built using sklearn
def model_performance_classification_sklearn(model, predictors, target):
    """
    Function to compute different metrics to check classification model performance

    model: classifier
    predictors: independent variables
    target: dependent variable
    """

    # predicting using the independent variables
    pred = model.predict(predictors)

    acc = accuracy_score(target, pred)  # to compute Accuracy
    recall = recall_score(target, pred)  # to compute Recall
    precision = precision_score(target, pred)  # to compute Precision
    f1 = f1_score(target, pred)  # to compute F1-score

    # creating a dataframe of metrics
    df_perf = pd.DataFrame(
        {"Accuracy": acc, "Recall": recall, "Precision": precision, "F1": f1,},
        index=[0],
    )

    return df_perf
def confusion_matrix_sklearn(model, predictors, target):
    """
    To plot the confusion_matrix with percentages

    model: classifier
    predictors: independent variables
    target: dependent variable
    """
    y_pred = model.predict(predictors)
    cm = confusion_matrix(target, y_pred)
    labels = np.asarray(
        [
            ["{0:0.0f}".format(item) + "\n{0:.2%}".format(item / cm.flatten().sum())]
            for item in cm.flatten()
        ]
    ).reshape(2, 2)

    plt.figure(figsize=(6, 4))
    sns.heatmap(cm, annot=labels, fmt="")
    plt.ylabel("True label")
    plt.xlabel("Predicted label")
We will build our model using the DecisionTreeClassifier class, with the default 'gini' criterion for splitting. Another option is 'entropy'.
dTree = DecisionTreeClassifier(criterion="gini", random_state=1)
dTree.fit(X_train, y_train)
DecisionTreeClassifier(random_state=1)
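To make the two split criteria concrete, the sketch below computes the Gini impurity and the entropy of a single node from its class shares, in plain Python. The root-node shares are rounded from the class balance reported earlier, so they are approximate for this particular split. The helper names `gini_impurity` and `entropy` are illustrative.

```python
import math


def gini_impurity(class_shares):
    """Gini impurity: 1 - sum(p_k^2) over the class shares in a node."""
    return 1.0 - sum(p * p for p in class_shares)


def entropy(class_shares):
    """Shannon entropy in bits: -sum(p_k * log2(p_k)), skipping empty classes."""
    return -sum(p * math.log2(p) for p in class_shares if p > 0)


# approximate root node: about 67% not canceled, 33% canceled
root = [0.672, 0.328]
pure = [1.0, 0.0]    # a leaf holding a single class
even = [0.5, 0.5]    # the most mixed two-class node

for name, shares in [("root", root), ("pure", pure), ("even", even)]:
    print(name, round(gini_impurity(shares), 4), round(entropy(shares), 4))
```

Both measures are zero for a pure node and largest for an evenly mixed one, and the tree chooses the split that reduces the chosen measure the most.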
confusion_matrix_sklearn(dTree, X_train, y_train)
decision_tree_perf_train = model_performance_classification_sklearn(
dTree, X_train, y_train
)
decision_tree_perf_train
| | Accuracy | Recall | Precision | F1 |
|---|---|---|---|---|
| 0 | 0.994211 | 0.986608 | 0.995776 | 0.991171 |
confusion_matrix_sklearn(dTree, X_test, y_test)
decision_tree_perf_test = model_performance_classification_sklearn(
dTree, X_test, y_test
)
decision_tree_perf_test
| | Accuracy | Recall | Precision | F1 |
|---|---|---|---|---|
| 0 | 0.872278 | 0.808348 | 0.79927 | 0.803783 |
feature_names = list(X.columns)
plt.figure(figsize=(20, 30))
tree.plot_tree(
dTree,
feature_names=feature_names,
filled=True,
fontsize=9,
node_ids=True,
class_names=True,
)
plt.show()
# Text report showing the rules of a decision tree -
print(tree.export_text(dTree, feature_names=feature_names, show_weights=True))
|--- lead_time <= 151.50 | |--- no_of_special_requests <= 0.50 | | |--- market_segment_type_Online <= 0.50 | | | |--- lead_time <= 90.50 | | | | |--- no_of_weekend_nights <= 0.50 | | | | | |--- avg_price_per_room <= 179.47 | | | | | | |--- market_segment_type_Offline <= 0.50 | | | | | | | |--- lead_time <= 16.50 | | | | | | | | |--- room_type_reserved_Room_Type 4 <= 0.50 | | | | | | | | | |--- repeated_guest <= 0.50 | | | | | | | | | | |--- lead_time <= 11.50 | | | | | | | | | | | |--- truncated branch of depth 13 | | | | | | | | | | |--- lead_time > 11.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | |--- repeated_guest > 0.50 | | | | | | | | | | |--- weights: [147.00, 0.00] class: 0.0 | | | | | | | | |--- room_type_reserved_Room_Type 4 > 0.50 | | | | | | | | | |--- arrival_date <= 29.50 | | | | | | | | | | |--- no_of_adults <= 1.50 | | | | | | | | | | | |--- truncated branch of depth 8 | | | | | | | | | | |--- no_of_adults > 1.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | |--- arrival_date > 29.50 | | | | | | | | | | |--- no_of_children <= 1.00 | | | | | | | | | | | |--- weights: [0.00, 3.00] class: 1.0 | | | | | | | | | | |--- no_of_children > 1.00 | | | | | | | | | | | |--- weights: [1.00, 0.00] class: 0.0 | | | | | | | |--- lead_time > 16.50 | | | | | | | | |--- avg_price_per_room <= 135.00 | | | | | | | | | |--- arrival_month <= 11.50 | | | | | | | | | | |--- lead_time <= 17.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- lead_time > 17.50 | | | | | | | | | | | |--- truncated branch of depth 12 | | | | | | | | | |--- arrival_month > 11.50 | | | | | | | | | | |--- weights: [29.00, 0.00] class: 0.0 | | | | | | | | |--- avg_price_per_room > 135.00 | | | | | | | | | |--- weights: [0.00, 8.00] class: 1.0 | | | | | | |--- market_segment_type_Offline > 0.50 | | | | | | | |--- weights: [1606.00, 0.00] class: 0.0 | | | | | |--- avg_price_per_room > 179.47 | | | | | | |--- 
arrival_date <= 25.50
| | | | | | | |--- room_type_reserved_Room_Type 1 <= 0.50
| | | | | | | | |--- weights: [1.00, 0.00] class: 0.0
| | | | | | | |--- room_type_reserved_Room_Type 1 > 0.50
| | | | | | | | |--- weights: [0.00, 16.00] class: 1.0
| | | | | | |--- arrival_date > 25.50
| | | | | | | |--- room_type_reserved_Room_Type 1 <= 0.50
| | | | | | | | |--- weights: [0.00, 1.00] class: 1.0
| | | | | | | |--- room_type_reserved_Room_Type 1 > 0.50
| | | | | | | | |--- weights: [3.00, 0.00] class: 0.0
| | | | |--- no_of_weekend_nights > 0.50
| | | | | |--- lead_time <= 68.50
| | | | | | |--- no_of_weekend_nights <= 4.50
| | | | | | | |--- lead_time <= 1.50
| | | | | | | | |--- arrival_date <= 27.50
| | | | | | | | | |--- no_of_week_nights <= 5.50
| | | | | | | | | | |--- type_of_meal_plan <= 1.50
| | | | | | | | | | | |--- truncated branch of depth 6
| | | | | | | | | | |--- type_of_meal_plan > 1.50
| | | | | | | | | | | |--- truncated branch of depth 2
| | | | | | | | | |--- no_of_week_nights > 5.50
| | | | | | | | | | |--- market_segment_type_Aviation <= 0.50
| | | | | | | | | | | |--- weights: [1.00, 0.00] class: 0.0
| | | | | | | | | | |--- market_segment_type_Aviation > 0.50
| | | | | | | | | | | |--- weights: [0.00, 1.00] class: 1.0
| | | | | | | | |--- arrival_date > 27.50
| | | | | | | | | |--- no_of_weekend_nights <= 1.50
| | | | | | | | | | |--- no_of_week_nights <= 2.50
| | | | | | | | | | | |--- truncated branch of depth 4
| | | | | | | | | | |--- no_of_week_nights > 2.50
| | | | | | | | | | | |--- weights: [0.00, 1.00] class: 1.0
| | | | | | | | | |--- no_of_weekend_nights > 1.50
| | | | | | | | | | |--- arrival_month <= 4.50
| | | | | | | | | | | |--- weights: [0.00, 33.00] class: 1.0
| | | | | | | | | | |--- arrival_month > 4.50
| | | | | | | | | | | |--- weights: [2.00, 0.00] class: 0.0
| | | | | | | |--- lead_time > 1.50
| | | | | | | | |--- arrival_month <= 9.50
| | | | | | | | | |--- lead_time <= 59.50
| | | | | | | | | | |--- arrival_month <= 7.50
| | | | | | | | | | | |--- truncated branch of depth 16
| | | | | | | | | | |--- arrival_month > 7.50
| | | | | | | | | | | |--- truncated branch of depth 12
| | | | | | | | | |--- lead_time > 59.50
| | | | | | | | | | |--- arrival_date <= 19.50
| | | | | | | | | | | |--- weights: [22.00, 0.00] class: 0.0
| | | | | | | | | | |--- arrival_date > 19.50
| | | | | | | | | | | |--- truncated branch of depth 4
| | | | | | | | |--- arrival_month > 9.50
| | | | | | | | | |--- market_segment_type_Aviation <= 0.50
| | | | | | | | | | |--- lead_time <= 65.50
| | | | | | | | | | | |--- truncated branch of depth 13
| | | | | | | | | | |--- lead_time > 65.50
| | | | | | | | | | | |--- truncated branch of depth 4
| | | | | | | | | |--- market_segment_type_Aviation > 0.50
| | | | | | | | | | |--- arrival_date <= 7.50
| | | | | | | | | | | |--- truncated branch of depth 2
| | | | | | | | | | |--- arrival_date > 7.50
| | | | | | | | | | | |--- truncated branch of depth 2
| | | | | | |--- no_of_weekend_nights > 4.50
| | | | | | | |--- weights: [0.00, 8.00] class: 1.0
| | | | | |--- lead_time > 68.50
| | | | | | |--- avg_price_per_room <= 99.98
| | | | | | | |--- arrival_month <= 3.50
| | | | | | | | |--- avg_price_per_room <= 62.50
| | | | | | | | | |--- weights: [21.00, 0.00] class: 0.0
| | | | | | | | |--- avg_price_per_room > 62.50
| | | | | | | | | |--- lead_time <= 77.00
| | | | | | | | | | |--- room_type_reserved_Room_Type 1 <= 0.50
| | | | | | | | | | | |--- weights: [1.00, 0.00] class: 0.0
| | | | | | | | | | |--- room_type_reserved_Room_Type 1 > 0.50
| | | | | | | | | | | |--- weights: [0.00, 9.00] class: 1.0
| | | | | | | | | |--- lead_time > 77.00
| | | | | | | | | | |--- no_of_weekend_nights <= 1.50
| | | | | | | | | | | |--- weights: [9.00, 0.00] class: 0.0
| | | | | | | | | | |--- no_of_weekend_nights > 1.50
| | | | | | | | | | | |--- truncated branch of depth 4
| | | | | | | |--- arrival_month > 3.50
| | | | | | | | |--- lead_time <= 71.50
| | | | | | | | | |--- no_of_week_nights <= 3.50
| | | | | | | | | | |--- weights: [3.00, 0.00] class: 0.0
| | | | | | | | | |--- no_of_week_nights > 3.50
| | | | | | | | | | |--- weights: [0.00, 2.00] class: 1.0
| | | | | | | | |--- lead_time > 71.50
| | | | | | | | | |--- no_of_week_nights <= 2.50
| | | | | | | | | | |--- lead_time <= 88.50
| | | | | | | | | | | |--- truncated branch of depth 4
| | | | | | | | | | |--- lead_time > 88.50
| | | | | | | | | | | |--- truncated branch of depth 3
| | | | | | | | | |--- no_of_week_nights > 2.50
| | | | | | | | | | |--- lead_time <= 73.50
| | | | | | | | | | | |--- weights: [0.00, 1.00] class: 1.0
| | | | | | | | | | |--- lead_time > 73.50
| | | | | | | | | | | |--- truncated branch of depth 5
| | | | | | |--- avg_price_per_room > 99.98
| | | | | | | |--- no_of_adults <= 1.50
| | | | | | | | |--- arrival_date <= 17.00
| | | | | | | | | |--- weights: [0.00, 52.00] class: 1.0
| | | | | | | | |--- arrival_date > 17.00
| | | | | | | | | |--- weights: [1.00, 0.00] class: 0.0
| | | | | | | |--- no_of_adults > 1.50
| | | | | | | | |--- avg_price_per_room <= 105.20
| | | | | | | | | |--- arrival_date <= 22.00
| | | | | | | | | | |--- lead_time <= 75.00
| | | | | | | | | | | |--- weights: [2.00, 0.00] class: 0.0
| | | | | | | | | | |--- lead_time > 75.00
| | | | | | | | | | | |--- truncated branch of depth 2
| | | | | | | | | |--- arrival_date > 22.00
| | | | | | | | | | |--- lead_time <= 78.50
| | | | | | | | | | | |--- weights: [0.00, 22.00] class: 1.0
| | | | | | | | | | |--- lead_time > 78.50
| | | | | | | | | | | |--- weights: [1.00, 0.00] class: 0.0
| | | | | | | | |--- avg_price_per_room > 105.20
| | | | | | | | | |--- lead_time <= 88.50
| | | | | | | | | | |--- arrival_date <= 3.00
| | | | | | | | | | | |--- weights: [0.00, 2.00] class: 1.0
| | | | | | | | | | |--- arrival_date > 3.00
| | | | | | | | | | | |--- truncated branch of depth 5
| | | | | | | | | |--- lead_time > 88.50
| | | | | | | | | | |--- weights: [0.00, 2.00] class: 1.0
| | | |--- lead_time > 90.50
| | | | |--- lead_time <= 117.50
| | | | | |--- avg_price_per_room <= 93.58
| | | | | | |--- arrival_date <= 6.50
| | | | | | | |--- no_of_week_nights <= 2.50
| | | | | | | | |--- no_of_weekend_nights <= 1.50
| | | | | | | | | |--- avg_price_per_room <= 80.38
| | | | | | | | | | |--- type_of_meal_plan <= 0.00
| | | | | | | | | | | |--- weights: [1.00, 0.00] class: 0.0
| | | | | | | | | | |--- type_of_meal_plan > 0.00
| | | | | | | | | | | |--- truncated branch of depth 4
| | | | | | | | | |--- avg_price_per_room > 80.38
| | | | | | | | | | |--- room_type_reserved_Room_Type 1 <= 0.50
| | | | | | | | | | | |--- weights: [0.00, 1.00] class: 1.0
| | | | | | | | | | |--- room_type_reserved_Room_Type 1 > 0.50
| | | | | | | | | | | |--- weights: [3.00, 0.00] class: 0.0
| | | | | | | | |--- no_of_weekend_nights > 1.50
| | | | | | | | | |--- arrival_date <= 5.00
| | | | | | | | | | |--- weights: [5.00, 0.00] class: 0.0
| | | | | | | | | |--- arrival_date > 5.00
| | | | | | | | | | |--- weights: [0.00, 1.00] class: 1.0
| | | | | | | |--- no_of_week_nights > 2.50
| | | | | | | | |--- arrival_date <= 5.50
| | | | | | | | | |--- weights: [35.00, 0.00] class: 0.0
| | | | | | | | |--- arrival_date > 5.50
| | | | | | | | | |--- avg_price_per_room <= 75.22
| | | | | | | | | | |--- weights: [3.00, 0.00] class: 0.0
| | | | | | | | | |--- avg_price_per_room > 75.22
| | | | | | | | | | |--- type_of_meal_plan <= 0.00
| | | | | | | | | | | |--- weights: [1.00, 0.00] class: 0.0
| | | | | | | | | | |--- type_of_meal_plan > 0.00
| | | | | | | | | | | |--- weights: [0.00, 2.00] class: 1.0
| | | | | | |--- arrival_date > 6.50
| | | | | | | |--- avg_price_per_room <= 66.50
| | | | | | | | |--- no_of_weekend_nights <= 1.50
| | | | | | | | | |--- arrival_date <= 9.50
| | | | | | | | | | |--- no_of_week_nights <= 3.00
| | | | | | | | | | | |--- weights: [0.00, 1.00] class: 1.0
| | | | | | | | | | |--- no_of_week_nights > 3.00
| | | | | | | | | | | |--- weights: [2.00, 0.00] class: 0.0
| | | | | | | | | |--- arrival_date > 9.50
| | | | | | | | | | |--- weights: [24.00, 0.00] class: 0.0
| | | | | | | | |--- no_of_weekend_nights > 1.50
| | | | | | | | | |--- avg_price_per_room <= 58.75
| | | | | | | | | | |--- weights: [6.00, 0.00] class: 0.0
| | | | | | | | | |--- avg_price_per_room > 58.75
| | | | | | | | | | |--- lead_time <= 97.50
| | | | | | | | | | | |--- truncated branch of depth 2
| | | | | | | | | | |--- lead_time > 97.50
| | | | | | | | | | | |--- weights: [0.00, 39.00] class: 1.0
| | | | | | | |--- avg_price_per_room > 66.50
| | | | | | | | |--- type_of_meal_plan <= 1.50
| | | | | | | | | |--- arrival_date <= 29.50
| | | | | | | | | | |--- type_of_meal_plan <= 0.00
| | | | | | | | | | | |--- weights: [0.00, 1.00] class: 1.0
| | | | | | | | | | |--- type_of_meal_plan > 0.00
| | | | | | | | | | | |--- truncated branch of depth 11
| | | | | | | | | |--- arrival_date > 29.50
| | | | | | | | | | |--- lead_time <= 96.00
| | | | | | | | | | | |--- weights: [8.00, 0.00] class: 0.0
| | | | | | | | | | |--- lead_time > 96.00
| | | | | | | | | | | |--- truncated branch of depth 4
| | | | | | | | |--- type_of_meal_plan > 1.50
| | | | | | | | | |--- avg_price_per_room <= 82.50
| | | | | | | | | | |--- lead_time <= 102.50
| | | | | | | | | | | |--- weights: [0.00, 7.00] class: 1.0
| | | | | | | | | | |--- lead_time > 102.50
| | | | | | | | | | | |--- weights: [1.00, 0.00] class: 0.0
| | | | | | | | | |--- avg_price_per_room > 82.50
| | | | | | | | | | |--- lead_time <= 99.00
| | | | | | | | | | | |--- weights: [1.00, 0.00] class: 0.0
| | | | | | | | | | |--- lead_time > 99.00
| | | | | | | | | | | |--- weights: [11.00, 2.00] class: 0.0
| | | | | |--- avg_price_per_room > 93.58
| | | | | | |--- arrival_date <= 16.50
| | | | | | | |--- arrival_month <= 7.50
| | | | | | | | |--- lead_time <= 108.50
| | | | | | | | | |--- lead_time <= 107.50
| | | | | | | | | | |--- avg_price_per_room <= 125.00
| | | | | | | | | | | |--- truncated branch of depth 4
| | | | | | | | | | |--- avg_price_per_room > 125.00
| | | | | | | | | | | |--- weights: [0.00, 1.00] class: 1.0
| | | | | | | | | |--- lead_time > 107.50
| | | | | | | | | | |--- weights: [0.00, 1.00] class: 1.0
| | | | | | | | |--- lead_time > 108.50
| | | | | | | | | |--- lead_time <= 111.50
| | | | | | | | | | |--- weights: [5.00, 0.00] class: 0.0
| | | | | | | | | |--- lead_time > 111.50
| | | | | | | | | | |--- no_of_adults <= 1.50
| | | | | | | | | | | |--- weights: [1.00, 0.00] class: 0.0
| | | | | | | | | | |--- no_of_adults > 1.50
| | | | | | | | | | | |--- weights: [12.00, 1.00] class: 0.0
| | | | | | | |--- arrival_month > 7.50
| | | | | | | | |--- avg_price_per_room <= 108.50
| | | | | | | | | |--- arrival_date <= 14.50
| | | | | | | | | | |--- lead_time <= 113.50
| | | | | | | | | | | |--- weights: [4.00, 0.00] class: 0.0
| | | | | | | | | | |--- lead_time > 113.50
| | | | | | | | | | | |--- truncated branch of depth 2
| | | | | | | | | |--- arrival_date > 14.50
| | | | | | | | | | |--- weights: [0.00, 47.00] class: 1.0
| | | | | | | | |--- avg_price_per_room > 108.50
| | | | | | | | | |--- no_of_weekend_nights <= 0.50
| | | | | | | | | | |--- weights: [42.00, 0.00] class: 0.0
| | | | | | | | | |--- no_of_weekend_nights > 0.50
| | | | | | | | | | |--- arrival_date <= 9.50
| | | | | | | | | | | |--- truncated branch of depth 4
| | | | | | | | | | |--- arrival_date > 9.50
| | | | | | | | | | | |--- truncated branch of depth 3
| | | | | | |--- arrival_date > 16.50
| | | | | | | |--- arrival_month <= 8.50
| | | | | | | | |--- avg_price_per_room <= 127.39
| | | | | | | | | |--- no_of_adults <= 1.50
| | | | | | | | | | |--- weights: [0.00, 50.00] class: 1.0
| | | | | | | | | |--- no_of_adults > 1.50
| | | | | | | | | | |--- no_of_week_nights <= 2.50
| | | | | | | | | | | |--- truncated branch of depth 4
| | | | | | | | | | |--- no_of_week_nights > 2.50
| | | | | | | | | | | |--- truncated branch of depth 4
| | | | | | | | |--- avg_price_per_room > 127.39
| | | | | | | | | |--- weights: [2.00, 0.00] class: 0.0
| | | | | | | |--- arrival_month > 8.50
| | | | | | | | |--- arrival_year <= 2017.50
| | | | | | | | | |--- weights: [0.00, 3.00] class: 1.0
| | | | | | | | |--- arrival_year > 2017.50
| | | | | | | | | |--- avg_price_per_room <= 101.34
| | | | | | | | | | |--- weights: [0.00, 3.00] class: 1.0
| | | | | | | | | |--- avg_price_per_room > 101.34
| | | | | | | | | | |--- avg_price_per_room <= 165.11
| | | | | | | | | | | |--- weights: [8.00, 0.00] class: 0.0
| | | | | | | | | | |--- avg_price_per_room > 165.11
| | | | | | | | | | | |--- weights: [0.00, 1.00] class: 1.0
| | | | |--- lead_time > 117.50
| | | | | |--- no_of_week_nights <= 1.50
| | | | | | |--- arrival_date <= 7.50
| | | | | | | |--- weights: [51.00, 0.00] class: 0.0
| | | | | | |--- arrival_date > 7.50
| | | | | | | |--- avg_price_per_room <= 93.58
| | | | | | | | |--- avg_price_per_room <= 65.38
| | | | | | | | | |--- weights: [0.00, 3.00] class: 1.0
| | | | | | | | |--- avg_price_per_room > 65.38
| | | | | | | | | |--- avg_price_per_room <= 89.88
| | | | | | | | | | |--- weights: [24.00, 0.00] class: 0.0
| | | | | | | | | |--- avg_price_per_room > 89.88
| | | | | | | | | | |--- arrival_month <= 10.00
| | | | | | | | | | | |--- weights: [8.00, 2.00] class: 0.0
| | | | | | | | | | |--- arrival_month > 10.00
| | | | | | | | | | | |--- weights: [1.00, 0.00] class: 0.0
| | | | | | | |--- avg_price_per_room > 93.58
| | | | | | | | |--- arrival_date <= 28.00
| | | | | | | | | |--- type_of_meal_plan <= 1.50
| | | | | | | | | | |--- no_of_adults <= 2.50
| | | | | | | | | | | |--- weights: [0.00, 17.00] class: 1.0
| | | | | | | | | | |--- no_of_adults > 2.50
| | | | | | | | | | | |--- weights: [1.00, 1.00] class: 0.0
| | | | | | | | | |--- type_of_meal_plan > 1.50
| | | | | | | | | | |--- avg_price_per_room <= 118.38
| | | | | | | | | | | |--- truncated branch of depth 3
| | | | | | | | | | |--- avg_price_per_room > 118.38
| | | | | | | | | | | |--- weights: [1.00, 0.00] class: 0.0
| | | | | | | | |--- arrival_date > 28.00
| | | | | | | | | |--- weights: [13.00, 1.00] class: 0.0
| | | | | |--- no_of_week_nights > 1.50
| | | | | | |--- no_of_adults <= 1.50
| | | | | | | |--- weights: [113.00, 0.00] class: 0.0
| | | | | | |--- no_of_adults > 1.50
| | | | | | | |--- lead_time <= 125.50
| | | | | | | | |--- avg_price_per_room <= 90.85
| | | | | | | | | |--- avg_price_per_room <= 87.50
| | | | | | | | | | |--- no_of_week_nights <= 3.50
| | | | | | | | | | | |--- truncated branch of depth 4
| | | | | | | | | | |--- no_of_week_nights > 3.50
| | | | | | | | | | | |--- truncated branch of depth 2
| | | | | | | | | |--- avg_price_per_room > 87.50
| | | | | | | | | | |--- weights: [0.00, 10.00] class: 1.0
| | | | | | | | |--- avg_price_per_room > 90.85
| | | | | | | | | |--- weights: [14.00, 0.00] class: 0.0
| | | | | | | |--- lead_time > 125.50
| | | | | | | | |--- avg_price_per_room <= 155.78
| | | | | | | | | |--- arrival_date <= 19.50
| | | | | | | | | | |--- arrival_date <= 10.50
| | | | | | | | | | | |--- truncated branch of depth 5
| | | | | | | | | | |--- arrival_date > 10.50
| | | | | | | | | | | |--- truncated branch of depth 9
| | | | | | | | | |--- arrival_date > 19.50
| | | | | | | | | | |--- lead_time <= 128.00
| | | | | | | | | | | |--- truncated branch of depth 2
| | | | | | | | | | |--- lead_time > 128.00
| | | | | | | | | | | |--- weights: [75.00, 0.00] class: 0.0
| | | | | | | | |--- avg_price_per_room > 155.78
| | | | | | | | | |--- weights: [0.00, 1.00] class: 1.0
| | |--- market_segment_type_Online > 0.50
| | | |--- lead_time <= 13.50
| | | | |--- avg_price_per_room <= 119.42
| | | | | |--- arrival_month <= 8.50
| | | | | | |--- arrival_month <= 1.50
| | | | | | | |--- weights: [128.00, 0.00] class: 0.0
| | | | | | |--- arrival_month > 1.50
| | | | | | | |--- lead_time <= 3.50
| | | | | | | | |--- no_of_weekend_nights <= 1.50
| | | | | | | | | |--- avg_price_per_room <= 106.50
| | | | | | | | | | |--- avg_price_per_room <= 74.57
| | | | | | | | | | | |--- weights: [37.00, 0.00] class: 0.0
| | | | | | | | | | |--- avg_price_per_room > 74.57
| | | | | | | | | | | |--- truncated branch of depth 15
| | | | | | | | | |--- avg_price_per_room > 106.50
| | | | | | | | | | |--- weights: [38.00, 0.00] class: 0.0
| | | | | | | | |--- no_of_weekend_nights > 1.50
| | | | | | | | | |--- arrival_date <= 27.00
| | | | | | | | | | |--- avg_price_per_room <= 75.46
| | | | | | | | | | | |--- truncated branch of depth 2
| | | | | | | | | | |--- avg_price_per_room > 75.46
| | | | | | | | | | | |--- truncated branch of depth 5
| | | | | | | | | |--- arrival_date > 27.00
| | | | | | | | | | |--- arrival_month <= 6.50
| | | | | | | | | | | |--- weights: [0.00, 12.00] class: 1.0
| | | | | | | | | | |--- arrival_month > 6.50
| | | | | | | | | | | |--- weights: [1.00, 0.00] class: 0.0
| | | | | | | |--- lead_time > 3.50
| | | | | | | | |--- avg_price_per_room <= 99.38
| | | | | | | | | |--- arrival_date <= 3.50
| | | | | | | | | | |--- weights: [14.00, 0.00] class: 0.0
| | | | | | | | | |--- arrival_date > 3.50
| | | | | | | | | | |--- lead_time <= 12.50
| | | | | | | | | | | |--- truncated branch of depth 12
| | | | | | | | | | |--- lead_time > 12.50
| | | | | | | | | | | |--- weights: [9.00, 0.00] class: 0.0
| | | | | | | | |--- avg_price_per_room > 99.38
| | | | | | | | | |--- avg_price_per_room <= 117.25
| | | | | | | | | | |--- avg_price_per_room <= 101.67
| | | | | | | | | | | |--- weights: [0.00, 7.00] class: 1.0
| | | | | | | | | | |--- avg_price_per_room > 101.67
| | | | | | | | | | | |--- truncated branch of depth 7
| | | | | | | | | |--- avg_price_per_room > 117.25
| | | | | | | | | | |--- arrival_month <= 3.50
| | | | | | | | | | | |--- truncated branch of depth 2
| | | | | | | | | | |--- arrival_month > 3.50
| | | | | | | | | | | |--- weights: [11.00, 0.00] class: 0.0
| | | | | |--- arrival_month > 8.50
| | | | | | |--- no_of_week_nights <= 4.50
| | | | | | | |--- avg_price_per_room <= 117.56
| | | | | | | | |--- arrival_year <= 2017.50
| | | | | | | | | |--- weights: [148.00, 0.00] class: 0.0
| | | | | | | | |--- arrival_year > 2017.50
| | | | | | | | | |--- arrival_month <= 11.50
| | | | | | | | | | |--- lead_time <= 8.50
| | | | | | | | | | | |--- truncated branch of depth 7
| | | | | | | | | | |--- lead_time > 8.50
| | | | | | | | | | | |--- truncated branch of depth 3
| | | | | | | | | |--- arrival_month > 11.50
| | | | | | | | | | |--- weights: [69.00, 0.00] class: 0.0
| | | | | | | |--- avg_price_per_room > 117.56
| | | | | | | | |--- type_of_meal_plan <= 0.00
| | | | | | | | | |--- lead_time <= 10.00
| | | | | | | | | | |--- arrival_date <= 25.50
| | | | | | | | | | | |--- weights: [2.00, 0.00] class: 0.0
| | | | | | | | | | |--- arrival_date > 25.50
| | | | | | | | | | | |--- weights: [0.00, 1.00] class: 1.0
| | | | | | | | | |--- lead_time > 10.00
| | | | | | | | | | |--- weights: [0.00, 2.00] class: 1.0
| | | | | | | | |--- type_of_meal_plan > 0.00
| | | | | | | | | |--- weights: [4.00, 0.00] class: 0.0
| | | | | | |--- no_of_week_nights > 4.50
| | | | | | | |--- arrival_month <= 11.50
| | | | | | | | |--- weights: [0.00, 5.00] class: 1.0
| | | | | | | |--- arrival_month > 11.50
| | | | | | | | |--- weights: [6.00, 0.00] class: 0.0
| | | | |--- avg_price_per_room > 119.42
| | | | | |--- lead_time <= 3.50
| | | | | | |--- avg_price_per_room <= 178.78
| | | | | | | |--- no_of_week_nights <= 4.50
| | | | | | | | |--- arrival_month <= 8.50
| | | | | | | | | |--- avg_price_per_room <= 134.50
| | | | | | | | | | |--- arrival_year <= 2017.50
| | | | | | | | | | | |--- truncated branch of depth 2
| | | | | | | | | | |--- arrival_year > 2017.50
| | | | | | | | | | | |--- truncated branch of depth 4
| | | | | | | | | |--- avg_price_per_room > 134.50
| | | | | | | | | | |--- avg_price_per_room <= 136.09
| | | | | | | | | | | |--- truncated branch of depth 2
| | | | | | | | | | |--- avg_price_per_room > 136.09
| | | | | | | | | | | |--- truncated branch of depth 9
| | | | | | | | |--- arrival_month > 8.50
| | | | | | | | | |--- avg_price_per_room <= 169.67
| | | | | | | | | | |--- arrival_date <= 5.00
| | | | | | | | | | | |--- truncated branch of depth 3
| | | | | | | | | | |--- arrival_date > 5.00
| | | | | | | | | | | |--- weights: [53.00, 0.00] class: 0.0
| | | | | | | | | |--- avg_price_per_room > 169.67
| | | | | | | | | | |--- avg_price_per_room <= 172.00
| | | | | | | | | | | |--- weights: [0.00, 1.00] class: 1.0
| | | | | | | | | | |--- avg_price_per_room > 172.00
| | | | | | | | | | | |--- weights: [2.00, 0.00] class: 0.0
| | | | | | | |--- no_of_week_nights > 4.50
| | | | | | | | |--- weights: [0.00, 3.00] class: 1.0
| | | | | | |--- avg_price_per_room > 178.78
| | | | | | | |--- arrival_date <= 15.50
| | | | | | | | |--- arrival_date <= 10.50
| | | | | | | | | |--- no_of_weekend_nights <= 0.50
| | | | | | | | | | |--- arrival_date <= 6.50
| | | | | | | | | | | |--- truncated branch of depth 2
| | | | | | | | | | |--- arrival_date > 6.50
| | | | | | | | | | | |--- weights: [6.00, 0.00] class: 0.0
| | | | | | | | | |--- no_of_weekend_nights > 0.50
| | | | | | | | | | |--- weights: [0.00, 2.00] class: 1.0
| | | | | | | | |--- arrival_date > 10.50
| | | | | | | | | |--- weights: [0.00, 7.00] class: 1.0
| | | | | | | |--- arrival_date > 15.50
| | | | | | | | |--- no_of_week_nights <= 3.50
| | | | | | | | | |--- arrival_date <= 24.50
| | | | | | | | | | |--- arrival_date <= 22.00
| | | | | | | | | | | |--- truncated branch of depth 4
| | | | | | | | | | |--- arrival_date > 22.00
| | | | | | | | | | | |--- weights: [0.00, 2.00] class: 1.0
| | | | | | | | | |--- arrival_date > 24.50
| | | | | | | | | | |--- weights: [6.00, 0.00] class: 0.0
| | | | | | | | |--- no_of_week_nights > 3.50
| | | | | | | | | |--- weights: [0.00, 2.00] class: 1.0
| | | | | |--- lead_time > 3.50
| | | | | | |--- arrival_month <= 8.50
| | | | | | | |--- arrival_year <= 2017.50
| | | | | | | | |--- lead_time <= 6.50
| | | | | | | | | |--- weights: [0.00, 1.00] class: 1.0
| | | | | | | | |--- lead_time > 6.50
| | | | | | | | | |--- weights: [4.00, 0.00] class: 0.0
| | | | | | | |--- arrival_year > 2017.50
| | | | | | | | |--- no_of_adults <= 2.50
| | | | | | | | | |--- avg_price_per_room <= 160.50
| | | | | | | | | | |--- arrival_month <= 6.50
| | | | | | | | | | | |--- truncated branch of depth 10
| | | | | | | | | | |--- arrival_month > 6.50
| | | | | | | | | | | |--- truncated branch of depth 10
| | | | | | | | | |--- avg_price_per_room > 160.50
| | | | | | | | | | |--- arrival_month <= 2.50
| | | | | | | | | | | |--- truncated branch of depth 3
| | | | | | | | | | |--- arrival_month > 2.50
| | | | | | | | | | | |--- weights: [0.00, 25.00] class: 1.0
| | | | | | | | |--- no_of_adults > 2.50
| | | | | | | | | |--- lead_time <= 5.50
| | | | | | | | | | |--- arrival_date <= 15.00
| | | | | | | | | | | |--- truncated branch of depth 2
| | | | | | | | | | |--- arrival_date > 15.00
| | | | | | | | | | | |--- weights: [4.00, 0.00] class: 0.0
| | | | | | | | | |--- lead_time > 5.50
| | | | | | | | | | |--- no_of_weekend_nights <= 1.50
| | | | | | | | | | | |--- truncated branch of depth 7
| | | | | | | | | | |--- no_of_weekend_nights > 1.50
| | | | | | | | | | | |--- weights: [0.00, 6.00] class: 1.0
| | | | | | |--- arrival_month > 8.50
| | | | | | | |--- arrival_date <= 14.00
| | | | | | | | |--- arrival_month <= 11.50
| | | | | | | | | |--- arrival_year <= 2017.50
| | | | | | | | | | |--- lead_time <= 9.50
| | | | | | | | | | | |--- weights: [5.00, 0.00] class: 0.0
| | | | | | | | | | |--- lead_time > 9.50
| | | | | | | | | | | |--- weights: [0.00, 1.00] class: 1.0
| | | | | | | | | |--- arrival_year > 2017.50
| | | | | | | | | | |--- arrival_date <= 2.50
| | | | | | | | | | | |--- truncated branch of depth 2
| | | | | | | | | | |--- arrival_date > 2.50
| | | | | | | | | | | |--- truncated branch of depth 5
| | | | | | | | |--- arrival_month > 11.50
| | | | | | | | | |--- weights: [7.00, 0.00] class: 0.0
| | | | | | | |--- arrival_date > 14.00
| | | | | | | | |--- avg_price_per_room <= 161.17
| | | | | | | | | |--- lead_time <= 12.50
| | | | | | | | | | |--- weights: [30.00, 0.00] class: 0.0
| | | | | | | | | |--- lead_time > 12.50
| | | | | | | | | | |--- no_of_week_nights <= 1.00
| | | | | | | | | | | |--- weights: [0.00, 1.00] class: 1.0
| | | | | | | | | | |--- no_of_week_nights > 1.00
| | | | | | | | | | | |--- weights: [1.00, 0.00] class: 0.0
| | | | | | | | |--- avg_price_per_room > 161.17
| | | | | | | | | |--- avg_price_per_room <= 166.50
| | | | | | | | | | |--- weights: [0.00, 2.00] class: 1.0
| | | | | | | | | |--- avg_price_per_room > 166.50
| | | | | | | | | | |--- no_of_children <= 1.00
| | | | | | | | | | | |--- weights: [6.00, 0.00] class: 0.0
| | | | | | | | | | |--- no_of_children > 1.00
| | | | | | | | | | | |--- truncated branch of depth 2
| | | |--- lead_time > 13.50
| | | | |--- avg_price_per_room <= 105.27
| | | | | |--- avg_price_per_room <= 60.07
| | | | | | |--- lead_time <= 84.50
| | | | | | | |--- lead_time <= 51.50
| | | | | | | | |--- lead_time <= 50.50
| | | | | | | | | |--- avg_price_per_room <= 29.04
| | | | | | | | | | |--- weights: [19.00, 0.00] class: 0.0
| | | | | | | | | |--- avg_price_per_room > 29.04
| | | | | | | | | | |--- avg_price_per_room <= 49.84
| | | | | | | | | | | |--- truncated branch of depth 2
| | | | | | | | | | |--- avg_price_per_room > 49.84
| | | | | | | | | | | |--- truncated branch of depth 4
| | | | | | | | |--- lead_time > 50.50
| | | | | | | | | |--- weights: [0.00, 1.00] class: 1.0
| | | | | | | |--- lead_time > 51.50
| | | | | | | | |--- weights: [32.00, 0.00] class: 0.0
| | | | | | |--- lead_time > 84.50
| | | | | | | |--- arrival_year <= 2017.50
| | | | | | | | |--- arrival_date <= 19.00
| | | | | | | | | |--- lead_time <= 139.00
| | | | | | | | | | |--- weights: [0.00, 8.00] class: 1.0
| | | | | | | | | |--- lead_time > 139.00
| | | | | | | | | | |--- weights: [1.00, 0.00] class: 0.0
| | | | | | | | |--- arrival_date > 19.00
| | | | | | | | | |--- lead_time <= 87.50
| | | | | | | | | | |--- weights: [0.00, 1.00] class: 1.0
| | | | | | | | | |--- lead_time > 87.50
| | | | | | | | | | |--- no_of_weekend_nights <= 0.50
| | | | | | | | | | | |--- truncated branch of depth 2
| | | | | | | | | | |--- no_of_weekend_nights > 0.50
| | | | | | | | | | | |--- weights: [6.00, 0.00] class: 0.0
| | | | | | | |--- arrival_year > 2017.50
| | | | | | | | |--- avg_price_per_room <= 59.43
| | | | | | | | | |--- weights: [14.00, 0.00] class: 0.0
| | | | | | | | |--- avg_price_per_room > 59.43
| | | | | | | | | |--- weights: [0.00, 1.00] class: 1.0
| | | | | |--- avg_price_per_room > 60.07
| | | | | | |--- lead_time <= 25.50
| | | | | | | |--- arrival_month <= 11.50
| | | | | | | | |--- arrival_month <= 1.50
| | | | | | | | | |--- weights: [29.00, 0.00] class: 0.0
| | | | | | | | |--- arrival_month > 1.50
| | | | | | | | | |--- arrival_year <= 2017.50
| | | | | | | | | | |--- lead_time <= 14.50
| | | | | | | | | | | |--- weights: [0.00, 1.00] class: 1.0
| | | | | | | | | | |--- lead_time > 14.50
| | | | | | | | | | | |--- truncated branch of depth 3
| | | | | | | | | |--- arrival_year > 2017.50
| | | | | | | | | | |--- no_of_week_nights <= 3.50
| | | | | | | | | | | |--- truncated branch of depth 14
| | | | | | | | | | |--- no_of_week_nights > 3.50
| | | | | | | | | | | |--- truncated branch of depth 4
| | | | | | | |--- arrival_month > 11.50
| | | | | | | | |--- weights: [54.00, 0.00] class: 0.0
| | | | | | |--- lead_time > 25.50
| | | | | | | |--- type_of_meal_plan <= 0.00
| | | | | | | | |--- required_car_parking_space <= 0.50
| | | | | | | | | |--- arrival_year <= 2017.50
| | | | | | | | | | |--- no_of_week_nights <= 5.00
| | | | | | | | | | | |--- weights: [6.00, 0.00] class: 0.0
| | | | | | | | | | |--- no_of_week_nights > 5.00
| | | | | | | | | | | |--- weights: [0.00, 1.00] class: 1.0
| | | | | | | | | |--- arrival_year > 2017.50
| | | | | | | | | | |--- no_of_adults <= 1.50
| | | | | | | | | | | |--- truncated branch of depth 5
| | | | | | | | | | |--- no_of_adults > 1.50
| | | | | | | | | | | |--- truncated branch of depth 15
| | | | | | | | |--- required_car_parking_space > 0.50
| | | | | | | | | |--- weights: [6.00, 0.00] class: 0.0
| | | | | | | |--- type_of_meal_plan > 0.00
| | | | | | | | |--- type_of_meal_plan <= 1.50
| | | | | | | | | |--- arrival_year <= 2017.50
| | | | | | | | | | |--- lead_time <= 60.50
| | | | | | | | | | | |--- truncated branch of depth 6
| | | | | | | | | | |--- lead_time > 60.50
| | | | | | | | | | | |--- truncated branch of depth 10
| | | | | | | | | |--- arrival_year > 2017.50
| | | | | | | | | | |--- required_car_parking_space <= 0.50
| | | | | | | | | | | |--- truncated branch of depth 28
| | | | | | | | | | |--- required_car_parking_space > 0.50
| | | | | | | | | | | |--- weights: [12.00, 0.00] class: 0.0
| | | | | | | | |--- type_of_meal_plan > 1.50
| | | | | | | | | |--- arrival_month <= 5.00
| | | | | | | | | | |--- weights: [2.00, 0.00] class: 0.0
| | | | | | | | | |--- arrival_month > 5.00
| | | | | | | | | | |--- lead_time <= 57.50
| | | | | | | | | | | |--- truncated branch of depth 2
| | | | | | | | | | |--- lead_time > 57.50
| | | | | | | | | | | |--- weights: [0.00, 35.00] class: 1.0
| | | | |--- avg_price_per_room > 105.27
| | | | | |--- required_car_parking_space <= 0.50
| | | | | | |--- arrival_month <= 10.50
| | | | | | | |--- arrival_year <= 2017.50
| | | | | | | | |--- type_of_meal_plan <= 1.50
| | | | | | | | | |--- arrival_month <= 9.50
| | | | | | | | | | |--- no_of_week_nights <= 3.50
| | | | | | | | | | | |--- truncated branch of depth 5
| | | | | | | | | | |--- no_of_week_nights > 3.50
| | | | | | | | | | | |--- weights: [0.00, 2.00] class: 1.0
| | | | | | | | | |--- arrival_month > 9.50
| | | | | | | | | | |--- weights: [12.00, 0.00] class: 0.0
| | | | | | | | |--- type_of_meal_plan > 1.50
| | | | | | | | | |--- weights: [0.00, 13.00] class: 1.0
| | | | | | | |--- arrival_year > 2017.50
| | | | | | | | |--- room_type_reserved_Room_Type 5 <= 0.50
| | | | | | | | | |--- avg_price_per_room <= 171.22
| | | | | | | | | | |--- arrival_date <= 24.50
| | | | | | | | | | | |--- truncated branch of depth 23
| | | | | | | | | | |--- arrival_date > 24.50
| | | | | | | | | | | |--- truncated branch of depth 16
| | | | | | | | | |--- avg_price_per_room > 171.22
| | | | | | | | | | |--- no_of_children <= 0.50
| | | | | | | | | | | |--- truncated branch of depth 8
| | | | | | | | | | |--- no_of_children > 0.50
| | | | | | | | | | | |--- truncated branch of depth 9
| | | | | | | | |--- room_type_reserved_Room_Type 5 > 0.50
| | | | | | | | | |--- arrival_date <= 26.50
| | | | | | | | | | |--- avg_price_per_room <= 175.71
| | | | | | | | | | | |--- truncated branch of depth 5
| | | | | | | | | | |--- avg_price_per_room > 175.71
| | | | | | | | | | | |--- weights: [0.00, 2.00] class: 1.0
| | | | | | | | | |--- arrival_date > 26.50
| | | | | | | | | | |--- weights: [0.00, 5.00] class: 1.0
| | | | | | |--- arrival_month > 10.50
| | | | | | | |--- lead_time <= 22.50
| | | | | | | | |--- no_of_adults <= 1.50
| | | | | | | | | |--- arrival_year <= 2017.50
| | | | | | | | | | |--- weights: [1.00, 0.00] class: 0.0
| | | | | | | | | |--- arrival_year > 2017.50
| | | | | | | | | | |--- weights: [0.00, 4.00] class: 1.0
| | | | | | | | |--- no_of_adults > 1.50
| | | | | | | | | |--- weights: [22.00, 0.00] class: 0.0
| | | | | | | |--- lead_time > 22.50
| | | | | | | | |--- avg_price_per_room <= 168.06
| | | | | | | | | |--- avg_price_per_room <= 147.75
| | | | | | | | | | |--- no_of_week_nights <= 3.50
| | | | | | | | | | | |--- truncated branch of depth 8
| | | | | | | | | | |--- no_of_week_nights > 3.50
| | | | | | | | | | | |--- truncated branch of depth 4
| | | | | | | | | |--- avg_price_per_room > 147.75
| | | | | | | | | | |--- weights: [0.00, 15.00] class: 1.0
| | | | | | | | |--- avg_price_per_room > 168.06
| | | | | | | | | |--- no_of_week_nights <= 5.00
| | | | | | | | | | |--- lead_time <= 80.00
| | | | | | | | | | | |--- truncated branch of depth 5
| | | | | | | | | | |--- lead_time > 80.00
| | | | | | | | | | | |--- weights: [0.00, 1.00] class: 1.0
| | | | | | | | | |--- no_of_week_nights > 5.00
| | | | | | | | | | |--- weights: [0.00, 1.00] class: 1.0
| | | | | |--- required_car_parking_space > 0.50
| | | | | | |--- no_of_weekend_nights <= 3.00
| | | | | | | |--- weights: [39.00, 0.00] class: 0.0
| | | | | | |--- no_of_weekend_nights > 3.00
| | | | | | | |--- weights: [0.00, 1.00] class: 1.0
| |--- no_of_special_requests > 0.50
| | |--- no_of_special_requests <= 1.50
| | | |--- market_segment_type_Online <= 0.50
| | | | |--- type_of_meal_plan <= 0.00
| | | | | |--- lead_time <= 63.00
| | | | | | |--- market_segment_type_Corporate <= 0.50
| | | | | | | |--- weights: [18.00, 0.00] class: 0.0
| | | | | | |--- market_segment_type_Corporate > 0.50
| | | | | | | |--- lead_time <= 12.50
| | | | | | | | |--- weights: [1.00, 0.00] class: 0.0
| | | | | | | |--- lead_time > 12.50
| | | | | | | | |--- weights: [2.00, 1.00] class: 0.0
| | | | | |--- lead_time > 63.00
| | | | | | |--- weights: [0.00, 6.00] class: 1.0
| | | | |--- type_of_meal_plan > 0.00
| | | | | |--- lead_time <= 102.50
| | | | | | |--- no_of_weekend_nights <= 3.50
| | | | | | | |--- room_type_reserved_Room_Type 5 <= 0.50
| | | | | | | | |--- lead_time <= 91.50
| | | | | | | | | |--- avg_price_per_room <= 129.50
| | | | | | | | | | |--- weights: [848.00, 0.00] class: 0.0
| | | | | | | | | |--- avg_price_per_room > 129.50
| | | | | | | | | | |--- avg_price_per_room <= 131.50
| | | | | | | | | | | |--- truncated branch of depth 2
| | | | | | | | | | |--- avg_price_per_room > 131.50
| | | | | | | | | | | |--- weights: [27.00, 0.00] class: 0.0
| | | | | | | | |--- lead_time > 91.50
| | | | | | | | | |--- no_of_children <= 0.50
| | | | | | | | | | |--- weights: [43.00, 0.00] class: 0.0
| | | | | | | | | |--- no_of_children > 0.50
| | | | | | | | | | |--- lead_time <= 95.50
| | | | | | | | | | | |--- weights: [0.00, 2.00] class: 1.0
| | | | | | | | | | |--- lead_time > 95.50
| | | | | | | | | | | |--- weights: [2.00, 0.00] class: 0.0
| | | | | | | |--- room_type_reserved_Room_Type 5 > 0.50
| | | | | | | | |--- avg_price_per_room <= 164.79
| | | | | | | | | |--- no_of_weekend_nights <= 1.50
| | | | | | | | | | |--- weights: [11.00, 0.00] class: 0.0
| | | | | | | | | |--- no_of_weekend_nights > 1.50
| | | | | | | | | | |--- market_segment_type_Corporate <= 0.50
| | | | | | | | | | | |--- weights: [2.00, 0.00] class: 0.0
| | | | | | | | | | |--- market_segment_type_Corporate > 0.50
| | | | | | | | | | | |--- weights: [0.00, 1.00] class: 1.0
| | | | | | | | |--- avg_price_per_room > 164.79
| | | | | | | | | |--- weights: [0.00, 1.00] class: 1.0
| | | | | | |--- no_of_weekend_nights > 3.50
| | | | | | | |--- weights: [0.00, 1.00] class: 1.0
| | | | | |--- lead_time > 102.50
| | | | | | |--- lead_time <= 104.50
| | | | | | | |--- no_of_week_nights <= 2.50
| | | | | | | | |--- avg_price_per_room <= 67.65
| | | | | | | | | |--- weights: [1.00, 0.00] class: 0.0
| | | | | | | | |--- avg_price_per_room > 67.65
| | | | | | | | | |--- weights: [0.00, 4.00] class: 1.0
| | | | | | | |--- no_of_week_nights > 2.50
| | | | | | | | |--- weights: [4.00, 0.00] class: 0.0
| | | | | | |--- lead_time > 104.50
| | | | | | | |--- avg_price_per_room <= 141.75
| | | | | | | | |--- no_of_week_nights <= 2.50
| | | | | | | | | |--- avg_price_per_room <= 83.39
| | | | | | | | | | |--- arrival_month <= 3.50
| | | | | | | | | | | |--- weights: [6.00, 0.00] class: 0.0
| | | | | | | | | | |--- arrival_month > 3.50
| | | | | | | | | | | |--- truncated branch of depth 4
| | | | | | | | | |--- avg_price_per_room > 83.39
| | | | | | | | | | |--- lead_time <= 143.50
| | | | | | | | | | | |--- truncated branch of depth 3
| | | | | | | | | | |--- lead_time > 143.50
| | | | | | | | | | | |--- truncated branch of depth 2
| | | | | | | | |--- no_of_week_nights > 2.50
| | | | | | | | | |--- avg_price_per_room <= 122.00
| | | | | | | | | | |--- weights: [54.00, 0.00] class: 0.0
| | | | | | | | | |--- avg_price_per_room > 122.00
| | | | | | | | | | |--- room_type_reserved_Room_Type 7 <= 0.50
| | | | | | | | | | | |--- weights: [0.00, 1.00] class: 1.0
| | | | | | | | | | |--- room_type_reserved_Room_Type 7 > 0.50
| | | | | | | | | | | |--- weights: [1.00, 0.00] class: 0.0
| | | | | | | |--- avg_price_per_room > 141.75
| | | | | | | | |--- room_type_reserved_Room_Type 4 <= 0.50
| | | | | | | | | |--- weights: [0.00, 2.00] class: 1.0
| | | | | | | | |--- room_type_reserved_Room_Type 4 > 0.50
| | | | | | | | | |--- weights: [1.00, 0.00] class: 0.0
| | | |--- market_segment_type_Online > 0.50
| | | | |--- lead_time <= 8.50
| | | | | |--- lead_time <= 4.50
| | | | | | |--- no_of_weekend_nights <= 3.50
| | | | | | | |--- avg_price_per_room <= 157.64
| | | | | | | | |--- room_type_reserved_Room_Type 2 <= 0.50
| | | | | | | | | |--- arrival_date <= 4.50
| | | | | | | | | | |--- weights: [81.00, 0.00] class: 0.0
| | | | | | | | | |--- arrival_date > 4.50
| | | | | | | | | | |--- arrival_date <= 27.50
| | | | | | | | | | | |--- truncated branch of depth 15
| | | | | | | | | | |--- arrival_date > 27.50
| | | | | | | | | | | |--- weights: [69.00, 0.00] class: 0.0
| | | | | | | | |--- room_type_reserved_Room_Type 2 > 0.50
| | | | | | | | | |--- no_of_week_nights <= 2.50
| | | | | | | | | | |--- weights: [5.00, 0.00] class: 0.0
| | | | | | | | | |--- no_of_week_nights > 2.50
| | | | | | | | | | |--- weights: [0.00, 1.00] class: 1.0
| | | | | | | |--- avg_price_per_room > 157.64
| | | | | | | | |--- avg_price_per_room <= 158.50
| | | | | | | | | |--- weights: [0.00, 1.00] class: 1.0
| | | | | | | | |--- avg_price_per_room > 158.50
| | | | | | | | | |--- room_type_reserved_Room_Type 4 <= 0.50
| | | | | | | | | | |--- lead_time <= 3.50
| | | | | | | | | | | |--- truncated branch of depth 3
| | | | | | | | | | |--- lead_time > 3.50
| | | | | | | | | | | |--- truncated branch of depth 4
| | | | | | | | | |--- room_type_reserved_Room_Type 4 > 0.50
| | | | | | | | | | |--- arrival_date <= 17.50
| | | | | | | | | | | |--- truncated branch of depth 3
| | | | | | | | | | |--- arrival_date > 17.50
| | | | | | | | | | | |--- truncated branch of depth 6
| | | | | | |--- no_of_weekend_nights > 3.50
| | | | | | | |--- arrival_year <= 2017.50
| | | | | | | | |--- weights: [1.00, 0.00] class: 0.0
| | | | | | | |--- arrival_year > 2017.50
| | | | | | | | |--- weights: [0.00, 2.00] class: 1.0
| | | | | |--- lead_time > 4.50
| | | | | | |--- room_type_reserved_Room_Type 2 <= 0.50
| | | | | | | |--- avg_price_per_room <= 123.60
| | | | | | | | |--- arrival_month <= 8.50
| | | | | | | | | |--- arrival_date <= 13.50
| | | | | | | | | | |--- arrival_month <= 4.50
| | | | | | | | | | | |--- truncated branch of depth 8
| | | | | | | | | | |--- arrival_month > 4.50
| | | | | | | | | | | |--- truncated branch of depth 5
| | | | | | | | | |--- arrival_date > 13.50
| | | | | | | | | | |--- no_of_weekend_nights <= 0.50
| | | | | | | | | | | |--- truncated branch of depth 6
| | | | | | | | | | |--- no_of_weekend_nights > 0.50
| | | | | | | | | | | |--- weights: [37.00, 0.00] class: 0.0
| | | | | | | | |--- arrival_month > 8.50
| | | | | | | | | |--- weights: [95.00, 0.00] class: 0.0
| | | | | | | |--- avg_price_per_room > 123.60
| | | | | | | | |--- type_of_meal_plan <= 0.00
| | | | | | | | | |--- arrival_month <= 9.50
| | | | | | | | | | |--- no_of_week_nights <= 1.50
| | | | | | | | | | | |--- truncated branch of depth 4
| | | | | | | | | | |--- no_of_week_nights > 1.50
| | | | | | | | | | | |--- weights: [11.00, 0.00] class: 0.0
| | | | | | | | | |--- arrival_month > 9.50
| | | | | | | | | | |--- lead_time <= 6.50
| | | | | | | | | | | |--- weights: [1.00, 0.00] class: 0.0
| | | | | | | | | | |--- lead_time > 6.50
| | | | | | | | | | | |--- truncated branch of depth 3
| | | | | | |
| |--- type_of_meal_plan > 0.00 | | | | | | | | | |--- arrival_date <= 15.50 | | | | | | | | | | |--- avg_price_per_room <= 128.91 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | | |--- avg_price_per_room > 128.91 | | | | | | | | | | | |--- truncated branch of depth 10 | | | | | | | | | |--- arrival_date > 15.50 | | | | | | | | | | |--- lead_time <= 6.50 | | | | | | | | | | | |--- weights: [42.00, 0.00] class: 0.0 | | | | | | | | | | |--- lead_time > 6.50 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | |--- room_type_reserved_Room_Type 2 > 0.50 | | | | | | | |--- no_of_weekend_nights <= 1.50 | | | | | | | | |--- weights: [2.00, 0.00] class: 0.0 | | | | | | | |--- no_of_weekend_nights > 1.50 | | | | | | | | |--- weights: [0.00, 3.00] class: 1.0 | | | | |--- lead_time > 8.50 | | | | | |--- required_car_parking_space <= 0.50 | | | | | | |--- avg_price_per_room <= 127.62 | | | | | | | |--- no_of_weekend_nights <= 2.50 | | | | | | | | |--- lead_time <= 43.50 | | | | | | | | | |--- arrival_month <= 11.50 | | | | | | | | | | |--- arrival_month <= 1.50 | | | | | | | | | | | |--- weights: [87.00, 0.00] class: 0.0 | | | | | | | | | | |--- arrival_month > 1.50 | | | | | | | | | | | |--- truncated branch of depth 23 | | | | | | | | | |--- arrival_month > 11.50 | | | | | | | | | | |--- weights: [128.00, 0.00] class: 0.0 | | | | | | | | |--- lead_time > 43.50 | | | | | | | | | |--- arrival_year <= 2017.50 | | | | | | | | | | |--- arrival_month <= 7.50 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | | |--- arrival_month > 7.50 | | | | | | | | | | | |--- truncated branch of depth 10 | | | | | | | | | |--- arrival_year > 2017.50 | | | | | | | | | | |--- arrival_month <= 8.50 | | | | | | | | | | | |--- truncated branch of depth 21 | | | | | | | | | | |--- arrival_month > 8.50 | | | | | | | | | | | |--- truncated branch of depth 21 | | | | | | | |--- no_of_weekend_nights > 2.50 | | | | | | | | |--- arrival_month <= 
1.50 | | | | | | | | | |--- weights: [3.00, 0.00] class: 0.0 | | | | | | | | |--- arrival_month > 1.50 | | | | | | | | | |--- avg_price_per_room <= 119.12 | | | | | | | | | | |--- arrival_date <= 10.50 | | | | | | | | | | | |--- weights: [0.00, 10.00] class: 1.0 | | | | | | | | | | |--- arrival_date > 10.50 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | |--- avg_price_per_room > 119.12 | | | | | | | | | | |--- weights: [3.00, 0.00] class: 0.0 | | | | | | |--- avg_price_per_room > 127.62 | | | | | | | |--- lead_time <= 142.50 | | | | | | | | |--- arrival_month <= 8.50 | | | | | | | | | |--- arrival_date <= 19.50 | | | | | | | | | | |--- avg_price_per_room <= 177.15 | | | | | | | | | | | |--- truncated branch of depth 15 | | | | | | | | | | |--- avg_price_per_room > 177.15 | | | | | | | | | | | |--- truncated branch of depth 6 | | | | | | | | | |--- arrival_date > 19.50 | | | | | | | | | | |--- arrival_date <= 27.50 | | | | | | | | | | | |--- truncated branch of depth 11 | | | | | | | | | | |--- arrival_date > 27.50 | | | | | | | | | | | |--- truncated branch of depth 10 | | | | | | | | |--- arrival_month > 8.50 | | | | | | | | | |--- arrival_month <= 11.50 | | | | | | | | | | |--- arrival_year <= 2017.50 | | | | | | | | | | | |--- truncated branch of depth 7 | | | | | | | | | | |--- arrival_year > 2017.50 | | | | | | | | | | | |--- truncated branch of depth 21 | | | | | | | | | |--- arrival_month > 11.50 | | | | | | | | | | |--- lead_time <= 100.50 | | | | | | | | | | | |--- weights: [49.00, 0.00] class: 0.0 | | | | | | | | | | |--- lead_time > 100.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | |--- lead_time > 142.50 | | | | | | | | |--- avg_price_per_room <= 142.65 | | | | | | | | | |--- no_of_week_nights <= 2.50 | | | | | | | | | | |--- weights: [5.00, 0.00] class: 0.0 | | | | | | | | | |--- no_of_week_nights > 2.50 | | | | | | | | | | |--- no_of_week_nights <= 3.50 | | | | | | | | | | | |--- weights: [0.00, 
2.00] class: 1.0 | | | | | | | | | | |--- no_of_week_nights > 3.50 | | | | | | | | | | | |--- weights: [1.00, 0.00] class: 0.0 | | | | | | | | |--- avg_price_per_room > 142.65 | | | | | | | | | |--- avg_price_per_room <= 179.11 | | | | | | | | | | |--- weights: [0.00, 11.00] class: 1.0 | | | | | | | | | |--- avg_price_per_room > 179.11 | | | | | | | | | | |--- lead_time <= 144.00 | | | | | | | | | | | |--- weights: [0.00, 1.00] class: 1.0 | | | | | | | | | | |--- lead_time > 144.00 | | | | | | | | | | | |--- weights: [1.00, 0.00] class: 0.0 | | | | | |--- required_car_parking_space > 0.50 | | | | | | |--- room_type_reserved_Room_Type 7 <= 0.50 | | | | | | | |--- weights: [180.00, 0.00] class: 0.0 | | | | | | |--- room_type_reserved_Room_Type 7 > 0.50 | | | | | | | |--- weights: [0.00, 1.00] class: 1.0 | | |--- no_of_special_requests > 1.50 | | | |--- lead_time <= 90.50 | | | | |--- no_of_week_nights <= 3.50 | | | | | |--- weights: [2126.00, 0.00] class: 0.0 | | | | |--- no_of_week_nights > 3.50 | | | | | |--- no_of_special_requests <= 2.50 | | | | | | |--- no_of_weekend_nights <= 4.50 | | | | | | | |--- arrival_month <= 11.50 | | | | | | | | |--- avg_price_per_room <= 92.42 | | | | | | | | | |--- lead_time <= 54.00 | | | | | | | | | | |--- avg_price_per_room <= 91.90 | | | | | | | | | | | |--- truncated branch of depth 7 | | | | | | | | | | |--- avg_price_per_room > 91.90 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | |--- lead_time > 54.00 | | | | | | | | | | |--- lead_time <= 78.50 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | | |--- lead_time > 78.50 | | | | | | | | | | | |--- weights: [4.00, 0.00] class: 0.0 | | | | | | | | |--- avg_price_per_room > 92.42 | | | | | | | | | |--- avg_price_per_room <= 107.29 | | | | | | | | | | |--- arrival_date <= 28.50 | | | | | | | | | | | |--- weights: [36.00, 0.00] class: 0.0 | | | | | | | | | | |--- arrival_date > 28.50 | | | | | | | | | | | |--- truncated branch of depth 
2 | | | | | | | | | |--- avg_price_per_room > 107.29 | | | | | | | | | | |--- room_type_reserved_Room_Type 2 <= 0.50 | | | | | | | | | | | |--- truncated branch of depth 12 | | | | | | | | | | |--- room_type_reserved_Room_Type 2 > 0.50 | | | | | | | | | | | |--- weights: [0.00, 1.00] class: 1.0 | | | | | | | |--- arrival_month > 11.50 | | | | | | | | |--- weights: [34.00, 0.00] class: 0.0 | | | | | | |--- no_of_weekend_nights > 4.50 | | | | | | | |--- weights: [0.00, 1.00] class: 1.0 | | | | | |--- no_of_special_requests > 2.50 | | | | | | |--- weights: [70.00, 0.00] class: 0.0 | | | |--- lead_time > 90.50 | | | | |--- no_of_special_requests <= 2.50 | | | | | |--- arrival_month <= 8.50 | | | | | | |--- lead_time <= 150.50 | | | | | | | |--- arrival_year <= 2017.50 | | | | | | | | |--- arrival_month <= 7.50 | | | | | | | | | |--- arrival_date <= 4.50 | | | | | | | | | | |--- weights: [1.00, 0.00] class: 0.0 | | | | | | | | | |--- arrival_date > 4.50 | | | | | | | | | | |--- arrival_date <= 26.00 | | | | | | | | | | | |--- weights: [0.00, 5.00] class: 1.0 | | | | | | | | | | |--- arrival_date > 26.00 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | |--- arrival_month > 7.50 | | | | | | | | | |--- arrival_date <= 14.00 | | | | | | | | | | |--- weights: [8.00, 0.00] class: 0.0 | | | | | | | | | |--- arrival_date > 14.00 | | | | | | | | | | |--- market_segment_type_Offline <= 0.50 | | | | | | | | | | | |--- weights: [0.00, 2.00] class: 1.0 | | | | | | | | | | |--- market_segment_type_Offline > 0.50 | | | | | | | | | | | |--- weights: [1.00, 0.00] class: 0.0 | | | | | | | |--- arrival_year > 2017.50 | | | | | | | | |--- no_of_children <= 0.50 | | | | | | | | | |--- avg_price_per_room <= 157.50 | | | | | | | | | | |--- arrival_date <= 29.50 | | | | | | | | | | | |--- truncated branch of depth 8 | | | | | | | | | | |--- arrival_date > 29.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | |--- avg_price_per_room > 157.50 | | | 
| | | | | | | |--- arrival_date <= 12.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- arrival_date > 12.50 | | | | | | | | | | | |--- weights: [7.00, 0.00] class: 0.0 | | | | | | | | |--- no_of_children > 0.50 | | | | | | | | | |--- no_of_week_nights <= 4.50 | | | | | | | | | | |--- no_of_children <= 2.50 | | | | | | | | | | | |--- truncated branch of depth 8 | | | | | | | | | | |--- no_of_children > 2.50 | | | | | | | | | | | |--- weights: [0.00, 1.00] class: 1.0 | | | | | | | | | |--- no_of_week_nights > 4.50 | | | | | | | | | | |--- arrival_month <= 6.50 | | | | | | | | | | | |--- weights: [2.00, 0.00] class: 0.0 | | | | | | | | | | |--- arrival_month > 6.50 | | | | | | | | | | | |--- weights: [0.00, 3.00] class: 1.0 | | | | | | |--- lead_time > 150.50 | | | | | | | |--- avg_price_per_room <= 103.50 | | | | | | | | |--- weights: [2.00, 0.00] class: 0.0 | | | | | | | |--- avg_price_per_room > 103.50 | | | | | | | | |--- weights: [0.00, 5.00] class: 1.0 | | | | | |--- arrival_month > 8.50 | | | | | | |--- avg_price_per_room <= 90.42 | | | | | | | |--- arrival_month <= 11.50 | | | | | | | | |--- arrival_date <= 21.50 | | | | | | | | | |--- room_type_reserved_Room_Type 2 <= 0.50 | | | | | | | | | | |--- lead_time <= 123.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- lead_time > 123.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | |--- room_type_reserved_Room_Type 2 > 0.50 | | | | | | | | | | |--- weights: [2.00, 0.00] class: 0.0 | | | | | | | | |--- arrival_date > 21.50 | | | | | | | | | |--- avg_price_per_room <= 90.12 | | | | | | | | | | |--- type_of_meal_plan <= 0.00 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- type_of_meal_plan > 0.00 | | | | | | | | | | | |--- weights: [8.00, 0.00] class: 0.0 | | | | | | | | | |--- avg_price_per_room > 90.12 | | | | | | | | | | |--- weights: [0.00, 2.00] class: 1.0 | | | | | | | |--- arrival_month 
> 11.50 | | | | | | | | |--- lead_time <= 101.00 | | | | | | | | | |--- weights: [11.00, 0.00] class: 0.0 | | | | | | | | |--- lead_time > 101.00 | | | | | | | | | |--- arrival_date <= 7.50 | | | | | | | | | | |--- weights: [0.00, 2.00] class: 1.0 | | | | | | | | | |--- arrival_date > 7.50 | | | | | | | | | | |--- arrival_date <= 21.50 | | | | | | | | | | | |--- weights: [7.00, 0.00] class: 0.0 | | | | | | | | | | |--- arrival_date > 21.50 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | |--- avg_price_per_room > 90.42 | | | | | | | |--- no_of_adults <= 1.50 | | | | | | | | |--- weights: [11.00, 0.00] class: 0.0 | | | | | | | |--- no_of_adults > 1.50 | | | | | | | | |--- avg_price_per_room <= 153.15 | | | | | | | | | |--- no_of_weekend_nights <= 1.50 | | | | | | | | | | |--- no_of_week_nights <= 2.50 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | | |--- no_of_week_nights > 2.50 | | | | | | | | | | | |--- truncated branch of depth 7 | | | | | | | | | |--- no_of_weekend_nights > 1.50 | | | | | | | | | | |--- lead_time <= 148.50 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | | |--- lead_time > 148.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | |--- avg_price_per_room > 153.15 | | | | | | | | | |--- arrival_date <= 26.50 | | | | | | | | | | |--- lead_time <= 148.50 | | | | | | | | | | | |--- weights: [14.00, 0.00] class: 0.0 | | | | | | | | | | |--- lead_time > 148.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | |--- arrival_date > 26.50 | | | | | | | | | | |--- room_type_reserved_Room_Type 4 <= 0.50 | | | | | | | | | | | |--- weights: [1.00, 0.00] class: 0.0 | | | | | | | | | | |--- room_type_reserved_Room_Type 4 > 0.50 | | | | | | | | | | | |--- weights: [0.00, 1.00] class: 1.0 | | | | |--- no_of_special_requests > 2.50 | | | | | |--- weights: [90.00, 0.00] class: 0.0 |--- lead_time > 151.50 | |--- avg_price_per_room <= 100.04 | | |--- 
no_of_special_requests <= 0.50 | | | |--- market_segment_type_Online <= 0.50 | | | | |--- no_of_adults <= 1.50 | | | | | |--- lead_time <= 163.50 | | | | | | |--- lead_time <= 162.50 | | | | | | | |--- avg_price_per_room <= 62.50 | | | | | | | | |--- weights: [1.00, 1.00] class: 0.0 | | | | | | | |--- avg_price_per_room > 62.50 | | | | | | | | |--- weights: [4.00, 0.00] class: 0.0 | | | | | | |--- lead_time > 162.50 | | | | | | | |--- weights: [0.00, 15.00] class: 1.0 | | | | | |--- lead_time > 163.50 | | | | | | |--- no_of_week_nights <= 4.50 | | | | | | | |--- lead_time <= 173.00 | | | | | | | | |--- arrival_date <= 3.50 | | | | | | | | | |--- no_of_weekend_nights <= 0.50 | | | | | | | | | | |--- no_of_week_nights <= 1.50 | | | | | | | | | | | |--- weights: [1.00, 0.00] class: 0.0 | | | | | | | | | | |--- no_of_week_nights > 1.50 | | | | | | | | | | | |--- weights: [61.00, 6.00] class: 0.0 | | | | | | | | | |--- no_of_weekend_nights > 0.50 | | | | | | | | | | |--- weights: [1.00, 0.00] class: 0.0 | | | | | | | | |--- arrival_date > 3.50 | | | | | | | | | |--- avg_price_per_room <= 70.85 | | | | | | | | | | |--- weights: [3.00, 0.00] class: 0.0 | | | | | | | | | |--- avg_price_per_room > 70.85 | | | | | | | | | | |--- weights: [0.00, 9.00] class: 1.0 | | | | | | | |--- lead_time > 173.00 | | | | | | | | |--- arrival_month <= 4.50 | | | | | | | | | |--- weights: [0.00, 2.00] class: 1.0 | | | | | | | | |--- arrival_month > 4.50 | | | | | | | | | |--- no_of_week_nights <= 0.50 | | | | | | | | | | |--- arrival_date <= 17.00 | | | | | | | | | | | |--- weights: [0.00, 1.00] class: 1.0 | | | | | | | | | | |--- arrival_date > 17.00 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | |--- no_of_week_nights > 0.50 | | | | | | | | | | |--- arrival_month <= 5.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- arrival_month > 5.50 | | | | | | | | | | | |--- truncated branch of depth 9 | | | | | | |--- no_of_week_nights > 
4.50 | | | | | | | |--- lead_time <= 283.25 | | | | | | | | |--- weights: [2.00, 0.00] class: 0.0 | | | | | | | |--- lead_time > 283.25 | | | | | | | | |--- avg_price_per_room <= 88.33 | | | | | | | | | |--- weights: [0.00, 7.00] class: 1.0 | | | | | | | | |--- avg_price_per_room > 88.33 | | | | | | | | | |--- weights: [1.00, 1.00] class: 0.0 | | | | |--- no_of_adults > 1.50 | | | | | |--- avg_price_per_room <= 84.58 | | | | | | |--- lead_time <= 244.00 | | | | | | | |--- no_of_week_nights <= 1.50 | | | | | | | | |--- no_of_weekend_nights <= 1.50 | | | | | | | | | |--- lead_time <= 166.50 | | | | | | | | | | |--- weights: [3.00, 0.00] class: 0.0 | | | | | | | | | |--- lead_time > 166.50 | | | | | | | | | | |--- arrival_date <= 19.00 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- arrival_date > 19.00 | | | | | | | | | | | |--- weights: [2.00, 0.00] class: 0.0 | | | | | | | | |--- no_of_weekend_nights > 1.50 | | | | | | | | | |--- weights: [24.00, 0.00] class: 0.0 | | | | | | | |--- no_of_week_nights > 1.50 | | | | | | | | |--- avg_price_per_room <= 66.50 | | | | | | | | | |--- no_of_weekend_nights <= 0.50 | | | | | | | | | | |--- avg_price_per_room <= 64.80 | | | | | | | | | | | |--- weights: [2.00, 0.00] class: 0.0 | | | | | | | | | | |--- avg_price_per_room > 64.80 | | | | | | | | | | | |--- weights: [0.00, 7.00] class: 1.0 | | | | | | | | | |--- no_of_weekend_nights > 0.50 | | | | | | | | | | |--- avg_price_per_room <= 28.57 | | | | | | | | | | | |--- weights: [0.00, 1.00] class: 1.0 | | | | | | | | | | |--- avg_price_per_room > 28.57 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | |--- avg_price_per_room > 66.50 | | | | | | | | | |--- type_of_meal_plan <= 1.50 | | | | | | | | | | |--- avg_price_per_room <= 75.75 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | | |--- avg_price_per_room > 75.75 | | | | | | | | | | | |--- truncated branch of depth 6 | | | | | | | | | |--- 
type_of_meal_plan > 1.50 | | | | | | | | | | |--- weights: [0.00, 1.00] class: 1.0 | | | | | | |--- lead_time > 244.00 | | | | | | | |--- arrival_month <= 11.50 | | | | | | | | |--- arrival_year <= 2017.50 | | | | | | | | | |--- weights: [34.00, 0.00] class: 0.0 | | | | | | | | |--- arrival_year > 2017.50 | | | | | | | | | |--- avg_price_per_room <= 80.38 | | | | | | | | | | |--- no_of_week_nights <= 3.50 | | | | | | | | | | | |--- truncated branch of depth 8 | | | | | | | | | | |--- no_of_week_nights > 3.50 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | |--- avg_price_per_room > 80.38 | | | | | | | | | | |--- weights: [11.00, 0.00] class: 0.0 | | | | | | | |--- arrival_month > 11.50 | | | | | | | | |--- weights: [37.00, 0.00] class: 0.0 | | | | | |--- avg_price_per_room > 84.58 | | | | | | |--- arrival_month <= 11.50 | | | | | | | |--- no_of_weekend_nights <= 1.50 | | | | | | | | |--- room_type_reserved_Room_Type 1 <= 0.50 | | | | | | | | | |--- weights: [4.00, 0.00] class: 0.0 | | | | | | | | |--- room_type_reserved_Room_Type 1 > 0.50 | | | | | | | | | |--- market_segment_type_Corporate <= 0.50 | | | | | | | | | | |--- no_of_adults <= 2.50 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | | |--- no_of_adults > 2.50 | | | | | | | | | | | |--- weights: [1.00, 0.00] class: 0.0 | | | | | | | | | |--- market_segment_type_Corporate > 0.50 | | | | | | | | | | |--- weights: [1.00, 0.00] class: 0.0 | | | | | | | |--- no_of_weekend_nights > 1.50 | | | | | | | | |--- arrival_month <= 6.50 | | | | | | | | | |--- weights: [0.00, 13.00] class: 1.0 | | | | | | | | |--- arrival_month > 6.50 | | | | | | | | | |--- weights: [14.00, 0.00] class: 0.0 | | | | | | |--- arrival_month > 11.50 | | | | | | | |--- weights: [9.00, 0.00] class: 0.0 | | | |--- market_segment_type_Online > 0.50 | | | | |--- avg_price_per_room <= 35.17 | | | | | |--- no_of_weekend_nights <= 0.50 | | | | | | |--- lead_time <= 205.00 | | | | | | | |--- 
arrival_month <= 6.00 | | | | | | | | |--- weights: [0.00, 1.00] class: 1.0 | | | | | | | |--- arrival_month > 6.00 | | | | | | | | |--- weights: [2.00, 0.00] class: 0.0 | | | | | | |--- lead_time > 205.00 | | | | | | | |--- arrival_date <= 19.00 | | | | | | | | |--- weights: [0.00, 5.00] class: 1.0 | | | | | | | |--- arrival_date > 19.00 | | | | | | | | |--- weights: [1.00, 0.00] class: 0.0 | | | | | |--- no_of_weekend_nights > 0.50 | | | | | | |--- weights: [9.00, 0.00] class: 0.0 | | | | |--- avg_price_per_room > 35.17 | | | | | |--- arrival_month <= 11.50 | | | | | | |--- weights: [0.00, 523.00] class: 1.0 | | | | | |--- arrival_month > 11.50 | | | | | | |--- no_of_weekend_nights <= 0.50 | | | | | | | |--- lead_time <= 263.50 | | | | | | | | |--- avg_price_per_room <= 76.87 | | | | | | | | | |--- no_of_week_nights <= 1.50 | | | | | | | | | | |--- weights: [1.00, 0.00] class: 0.0 | | | | | | | | | |--- no_of_week_nights > 1.50 | | | | | | | | | | |--- weights: [0.00, 5.00] class: 1.0 | | | | | | | | |--- avg_price_per_room > 76.87 | | | | | | | | | |--- weights: [3.00, 0.00] class: 0.0 | | | | | | | |--- lead_time > 263.50 | | | | | | | | |--- weights: [0.00, 7.00] class: 1.0 | | | | | | |--- no_of_weekend_nights > 0.50 | | | | | | | |--- no_of_week_nights <= 1.50 | | | | | | | | |--- arrival_date <= 3.50 | | | | | | | | | |--- weights: [1.00, 0.00] class: 0.0 | | | | | | | | |--- arrival_date > 3.50 | | | | | | | | | |--- weights: [0.00, 6.00] class: 1.0 | | | | | | | |--- no_of_week_nights > 1.50 | | | | | | | | |--- weights: [0.00, 58.00] class: 1.0 | | |--- no_of_special_requests > 0.50 | | | |--- no_of_weekend_nights <= 0.50 | | | | |--- lead_time <= 180.50 | | | | | |--- lead_time <= 159.50 | | | | | | |--- arrival_month <= 8.50 | | | | | | | |--- weights: [8.00, 0.00] class: 0.0 | | | | | | |--- arrival_month > 8.50 | | | | | | | |--- arrival_date <= 23.50 | | | | | | | | |--- lead_time <= 156.50 | | | | | | | | | |--- weights: [0.00, 1.00] class: 1.0 | | 
| | | | | | |--- lead_time > 156.50 | | | | | | | | | |--- weights: [2.00, 0.00] class: 0.0 | | | | | | | |--- arrival_date > 23.50 | | | | | | | | |--- weights: [0.00, 4.00] class: 1.0 | | | | | |--- lead_time > 159.50 | | | | | | |--- no_of_adults <= 0.50 | | | | | | | |--- weights: [0.00, 1.00] class: 1.0 | | | | | | |--- no_of_adults > 0.50 | | | | | | | |--- arrival_date <= 1.50 | | | | | | | | |--- lead_time <= 176.50 | | | | | | | | | |--- weights: [2.00, 0.00] class: 0.0 | | | | | | | | |--- lead_time > 176.50 | | | | | | | | | |--- weights: [0.00, 2.00] class: 1.0 | | | | | | | |--- arrival_date > 1.50 | | | | | | | | |--- weights: [48.00, 0.00] class: 0.0 | | | | |--- lead_time > 180.50 | | | | | |--- market_segment_type_Offline <= 0.50 | | | | | | |--- no_of_special_requests <= 2.50 | | | | | | | |--- arrival_month <= 11.50 | | | | | | | | |--- no_of_week_nights <= 0.50 | | | | | | | | | |--- weights: [2.00, 0.00] class: 0.0 | | | | | | | | |--- no_of_week_nights > 0.50 | | | | | | | | | |--- weights: [0.00, 125.00] class: 1.0 | | | | | | | |--- arrival_month > 11.50 | | | | | | | | |--- lead_time <= 272.00 | | | | | | | | | |--- lead_time <= 226.50 | | | | | | | | | | |--- weights: [0.00, 3.00] class: 1.0 | | | | | | | | | |--- lead_time > 226.50 | | | | | | | | | | |--- weights: [6.00, 0.00] class: 0.0 | | | | | | | | |--- lead_time > 272.00 | | | | | | | | | |--- avg_price_per_room <= 73.10 | | | | | | | | | | |--- weights: [1.00, 0.00] class: 0.0 | | | | | | | | | |--- avg_price_per_room > 73.10 | | | | | | | | | | |--- arrival_date <= 23.50 | | | | | | | | | | | |--- weights: [0.00, 6.00] class: 1.0 | | | | | | | | | | |--- arrival_date > 23.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | |--- no_of_special_requests > 2.50 | | | | | | | |--- weights: [12.00, 0.00] class: 0.0 | | | | | |--- market_segment_type_Offline > 0.50 | | | | | | |--- avg_price_per_room <= 96.37 | | | | | | | |--- no_of_adults <= 1.50 | | | | | | | | |--- 
weights: [0.00, 1.00] class: 1.0 | | | | | | | |--- no_of_adults > 1.50 | | | | | | | | |--- lead_time <= 279.75 | | | | | | | | | |--- weights: [15.00, 0.00] class: 0.0 | | | | | | | | |--- lead_time > 279.75 | | | | | | | | | |--- weights: [2.00, 1.00] class: 0.0 | | | | | | |--- avg_price_per_room > 96.37 | | | | | | | |--- weights: [0.00, 2.00] class: 1.0 | | | |--- no_of_weekend_nights > 0.50 | | | | |--- market_segment_type_Online <= 0.50 | | | | | |--- no_of_week_nights <= 5.50 | | | | | | |--- lead_time <= 284.25 | | | | | | | |--- arrival_date <= 30.00 | | | | | | | | |--- weights: [119.00, 0.00] class: 0.0 | | | | | | | |--- arrival_date > 30.00 | | | | | | | | |--- no_of_week_nights <= 3.00 | | | | | | | | | |--- weights: [3.00, 0.00] class: 0.0 | | | | | | | | |--- no_of_week_nights > 3.00 | | | | | | | | | |--- weights: [2.00, 1.00] class: 0.0 | | | | | | |--- lead_time > 284.25 | | | | | | | |--- arrival_date <= 21.00 | | | | | | | | |--- no_of_week_nights <= 2.50 | | | | | | | | | |--- weights: [13.00, 0.00] class: 0.0 | | | | | | | | |--- no_of_week_nights > 2.50 | | | | | | | | | |--- arrival_month <= 9.50 | | | | | | | | | | |--- weights: [5.00, 0.00] class: 0.0 | | | | | | | | | |--- arrival_month > 9.50 | | | | | | | | | | |--- avg_price_per_room <= 58.50 | | | | | | | | | | | |--- weights: [1.00, 0.00] class: 0.0 | | | | | | | | | | |--- avg_price_per_room > 58.50 | | | | | | | | | | | |--- weights: [6.00, 2.00] class: 0.0 | | | | | | | |--- arrival_date > 21.00 | | | | | | | | |--- weights: [1.00, 1.00] class: 0.0 | | | | | |--- no_of_week_nights > 5.50 | | | | | | |--- arrival_month <= 7.50 | | | | | | | |--- weights: [0.00, 1.00] class: 1.0 | | | | | | |--- arrival_month > 7.50 | | | | | | | |--- weights: [1.00, 0.00] class: 0.0 | | | | |--- market_segment_type_Online > 0.50 | | | | | |--- arrival_month <= 11.50 | | | | | | |--- no_of_weekend_nights <= 3.50 | | | | | | | |--- arrival_date <= 27.50 | | | | | | | | |--- avg_price_per_room <= 
81.12 | | | | | | | | | |--- lead_time <= 153.50 | | | | | | | | | | |--- no_of_weekend_nights <= 1.50 | | | | | | | | | | | |--- weights: [0.00, 2.00] class: 1.0 | | | | | | | | | | |--- no_of_weekend_nights > 1.50 | | | | | | | | | | | |--- weights: [1.00, 0.00] class: 0.0 | | | | | | | | | |--- lead_time > 153.50 | | | | | | | | | | |--- lead_time <= 157.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- lead_time > 157.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | |--- avg_price_per_room > 81.12 | | | | | | | | | |--- lead_time <= 231.50 | | | | | | | | | | |--- type_of_meal_plan <= 0.00 | | | | | | | | | | | |--- truncated branch of depth 7 | | | | | | | | | | |--- type_of_meal_plan > 0.00 | | | | | | | | | | | |--- truncated branch of depth 14 | | | | | | | | | |--- lead_time > 231.50 | | | | | | | | | | |--- no_of_children <= 1.50 | | | | | | | | | | | |--- truncated branch of depth 9 | | | | | | | | | | |--- no_of_children > 1.50 | | | | | | | | | | | |--- weights: [0.00, 3.00] class: 1.0 | | | | | | | |--- arrival_date > 27.50 | | | | | | | | |--- no_of_week_nights <= 1.50 | | | | | | | | | |--- lead_time <= 224.50 | | | | | | | | | | |--- lead_time <= 175.50 | | | | | | | | | | | |--- weights: [1.00, 0.00] class: 0.0 | | | | | | | | | | |--- lead_time > 175.50 | | | | | | | | | | | |--- weights: [0.00, 10.00] class: 1.0 | | | | | | | | | |--- lead_time > 224.50 | | | | | | | | | | |--- weights: [4.00, 0.00] class: 0.0 | | | | | | | | |--- no_of_week_nights > 1.50 | | | | | | | | | |--- lead_time <= 269.00 | | | | | | | | | | |--- lead_time <= 176.00 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | | |--- lead_time > 176.00 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | |--- lead_time > 269.00 | | | | | | | | | | |--- weights: [0.00, 3.00] class: 1.0 | | | | | | |--- no_of_weekend_nights > 3.50 | | | | | | | |--- room_type_reserved_Room_Type 4 <= 
0.50 | | | | | | | | |--- no_of_week_nights <= 5.50 | | | | | | | | | |--- weights: [1.00, 0.00] class: 0.0 | | | | | | | | |--- no_of_week_nights > 5.50 | | | | | | | | | |--- weights: [0.00, 6.00] class: 1.0 | | | | | | | |--- room_type_reserved_Room_Type 4 > 0.50 | | | | | | | | |--- weights: [1.00, 0.00] class: 0.0 | | | | | |--- arrival_month > 11.50 | | | | | | |--- arrival_date <= 14.50 | | | | | | | |--- arrival_date <= 3.00 | | | | | | | | |--- weights: [0.00, 1.00] class: 1.0 | | | | | | | |--- arrival_date > 3.00 | | | | | | | | |--- lead_time <= 217.50 | | | | | | | | | |--- weights: [8.00, 0.00] class: 0.0 | | | | | | | | |--- lead_time > 217.50 | | | | | | | | | |--- lead_time <= 235.50 | | | | | | | | | | |--- weights: [0.00, 1.00] class: 1.0 | | | | | | | | | |--- lead_time > 235.50 | | | | | | | | | | |--- weights: [3.00, 0.00] class: 0.0 | | | | | | |--- arrival_date > 14.50 | | | | | | | |--- avg_price_per_room <= 55.92 | | | | | | | | |--- weights: [3.00, 0.00] class: 0.0 | | | | | | | |--- avg_price_per_room > 55.92 | | | | | | | | |--- avg_price_per_room <= 80.19 | | | | | | | | | |--- no_of_special_requests <= 2.50 | | | | | | | | | | |--- avg_price_per_room <= 68.32 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- avg_price_per_room > 68.32 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | |--- no_of_special_requests > 2.50 | | | | | | | | | | |--- weights: [1.00, 0.00] class: 0.0 | | | | | | | | |--- avg_price_per_room > 80.19 | | | | | | | | | |--- no_of_week_nights <= 3.50 | | | | | | | | | | |--- avg_price_per_room <= 89.75 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | | |--- avg_price_per_room > 89.75 | | | | | | | | | | | |--- weights: [0.00, 3.00] class: 1.0 | | | | | | | | | |--- no_of_week_nights > 3.50 | | | | | | | | | | |--- weights: [4.00, 0.00] class: 0.0 | |--- avg_price_per_room > 100.04 | | |--- arrival_month <= 11.50 | | | |--- 
no_of_special_requests <= 2.50 | | | | |--- weights: [0.00, 2108.00] class: 1.0 | | | |--- no_of_special_requests > 2.50 | | | | |--- weights: [31.00, 0.00] class: 0.0 | | |--- arrival_month > 11.50 | | | |--- no_of_special_requests <= 0.50 | | | | |--- weights: [47.00, 0.00] class: 0.0 | | | |--- no_of_special_requests > 0.50 | | | | |--- arrival_date <= 24.50 | | | | | |--- weights: [5.00, 0.00] class: 0.0 | | | | |--- arrival_date > 24.50 | | | | | |--- room_type_reserved_Room_Type 1 <= 0.50 | | | | | | |--- lead_time <= 172.50 | | | | | | | |--- no_of_children <= 1.00 | | | | | | | | |--- weights: [2.00, 0.00] class: 0.0 | | | | | | | |--- no_of_children > 1.00 | | | | | | | | |--- weights: [0.00, 1.00] class: 1.0 | | | | | | |--- lead_time > 172.50 | | | | | | | |--- weights: [0.00, 13.00] class: 1.0 | | | | | |--- room_type_reserved_Room_Type 1 > 0.50 | | | | | | |--- avg_price_per_room <= 113.47 | | | | | | | |--- weights: [0.00, 1.00] class: 1.0 | | | | | | |--- avg_price_per_room > 113.47 | | | | | | | |--- weights: [3.00, 0.00] class: 0.0
# Importance of each feature in the fitted tree. A feature's importance is computed as the
# (normalized) total reduction of the split criterion brought by that feature; it is also known as the Gini importance.
print(
pd.DataFrame(
dTree.feature_importances_, columns=["Imp"], index=X_train.columns
).sort_values(by="Imp", ascending=False)
)
| | Imp |
|---|---|
| lead_time | 0.350426 |
| avg_price_per_room | 0.169167 |
| market_segment_type_Online | 0.093218 |
| arrival_date | 0.085588 |
| no_of_special_requests | 0.067350 |
| arrival_month | 0.063954 |
| no_of_week_nights | 0.044595 |
| no_of_weekend_nights | 0.041870 |
| no_of_adults | 0.027230 |
| type_of_meal_plan | 0.014130 |
| arrival_year | 0.012097 |
| required_car_parking_space | 0.006614 |
| no_of_children | 0.005075 |
| room_type_reserved_Room_Type 4 | 0.004837 |
| room_type_reserved_Room_Type 1 | 0.004795 |
| market_segment_type_Offline | 0.002930 |
| room_type_reserved_Room_Type 2 | 0.001758 |
| room_type_reserved_Room_Type 5 | 0.000983 |
| market_segment_type_Corporate | 0.000784 |
| room_type_reserved_Room_Type 7 | 0.000773 |
| repeated_guest | 0.000638 |
| market_segment_type_Aviation | 0.000412 |
| room_type_reserved_Room_Type 6 | 0.000406 |
| no_of_previous_bookings_not_canceled | 0.000212 |
| no_of_previous_cancellations | 0.000091 |
| market_segment_type_Complementary | 0.000054 |
| room_type_reserved_Room_Type 3 | 0.000013 |
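To make the definition of Gini importance concrete, the following is a minimal sketch that recomputes the importances by hand from a fitted tree's internal arrays and checks them against `feature_importances_`. It runs on synthetic data from `make_classification`, not on the booking dataset, and the names `X_demo`, `y_demo`, `demo_tree`, and `manual_importances` are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption: this is NOT the hotel booking dataset)
X_demo, y_demo = make_classification(
    n_samples=500, n_features=6, n_informative=3, random_state=1
)
demo_tree = DecisionTreeClassifier(random_state=1).fit(X_demo, y_demo)

t = demo_tree.tree_
manual_importances = np.zeros(X_demo.shape[1])

for node in range(t.node_count):
    left, right = t.children_left[node], t.children_right[node]
    if left == -1:  # leaf node: no split, so no impurity reduction
        continue
    # Weighted impurity of the parent minus that of its two children
    decrease = (
        t.weighted_n_node_samples[node] * t.impurity[node]
        - t.weighted_n_node_samples[left] * t.impurity[left]
        - t.weighted_n_node_samples[right] * t.impurity[right]
    )
    # Credit the reduction to the feature used for this split
    manual_importances[t.feature[node]] += decrease

# Normalize so that the importances sum to 1
manual_importances /= manual_importances.sum()

print(np.allclose(manual_importances, demo_tree.feature_importances_))
```

Because the reductions are summed over every split that uses a feature, a feature used in many deep, small splits can still accumulate a noticeable share, which is worth keeping in mind when reading the ranking for an unpruned tree.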
importances = dTree.feature_importances_
indices = np.argsort(importances)
plt.figure(figsize=(12, 12))
plt.title("Feature Importances")
plt.barh(range(len(indices)), importances[indices], color="violet", align="center")
plt.yticks(range(len(indices)), [feature_names[i] for i in indices])
plt.xlabel("Relative Importance")
plt.show()
The tree above is very complex; a tree this deep often overfits the training data.
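One rough way to put numbers on this is to look at the size of an unconstrained tree alongside the gap between its training and test scores. The sketch below does so on synthetic data from `make_classification` (an assumption; it does not use the hotel bookings), and `complexity_report` is a name introduced only for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data with some label noise (assumption: not the INN Hotels dataset)
X_demo, y_demo = make_classification(
    n_samples=2000, n_features=10, n_informative=4, flip_y=0.1, random_state=1
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=1, stratify=y_demo
)

# With no depth or leaf limits, the tree keeps splitting until its leaves are pure
full_tree = DecisionTreeClassifier(random_state=1, class_weight="balanced")
full_tree.fit(X_tr, y_tr)

complexity_report = {
    "depth": full_tree.get_depth(),
    "n_leaves": full_tree.get_n_leaves(),
    "f1_train": f1_score(y_tr, full_tree.predict(X_tr)),
    "f1_test": f1_score(y_te, full_tree.predict(X_te)),
}
print(complexity_report)
```

A large number of leaves combined with a training score well above the test score is the usual sign that the tree has memorized noise, which is the motivation for limiting or pruning it.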
# Choose the type of classifier.
estimator = DecisionTreeClassifier(random_state=1, class_weight="balanced")
# Grid of parameters to choose from
parameters = {
    "max_depth": [5, 10, 15, None],
    "max_leaf_nodes": [50, 75, 150, 250],
    "splitter": ["best", "random"],
    "criterion": ["entropy", "gini"],
    "min_impurity_decrease": [0.00001, 0.0001, 0.01],
}
# Type of scoring used to compare parameter combinations (F1 score, despite the variable name)
acc_scorer = make_scorer(f1_score)
# Run the grid search
grid_obj = GridSearchCV(estimator, parameters, scoring=acc_scorer, cv=5)
grid_obj = grid_obj.fit(X_train, y_train)
# Set the clf to the best combination of parameters
estimator = grid_obj.best_estimator_
# Fit the best algorithm to the data.
estimator.fit(X_train, y_train)
DecisionTreeClassifier(class_weight='balanced', criterion='entropy',
                       max_leaf_nodes=250, min_impurity_decrease=1e-05,
                       random_state=1)
confusion_matrix_sklearn(estimator, X_train, y_train)
decision_tree_tune_perf_train = model_performance_classification_sklearn(
    estimator, X_train, y_train
)
decision_tree_tune_perf_train
| | Accuracy | Recall | Precision | F1 |
|---|---|---|---|---|
| 0 | 0.879923 | 0.856989 | 0.794568 | 0.824599 |
confusion_matrix_sklearn(estimator, X_test, y_test)
decision_tree_tune_perf_test = model_performance_classification_sklearn(
    estimator, X_test, y_test
)
decision_tree_tune_perf_test
| | Accuracy | Recall | Precision | F1 |
|---|---|---|---|---|
| 0 | 0.869981 | 0.839012 | 0.777018 | 0.806826 |
feature_names = list(X.columns)
plt.figure(figsize=(20, 30))
tree.plot_tree(
    estimator,
    feature_names=feature_names,
    filled=True,
    fontsize=9,
    node_ids=True,
    class_names=True,
)
plt.show()
# Text report showing the rules of the tuned decision tree
print(tree.export_text(estimator, feature_names=feature_names, show_weights=True))
|--- lead_time <= 151.50 | |--- no_of_special_requests <= 0.50 | | |--- market_segment_type_Online <= 0.50 | | | |--- lead_time <= 74.50 | | | | |--- arrival_month <= 9.50 | | | | | |--- no_of_weekend_nights <= 1.50 | | | | | | |--- market_segment_type_Offline <= 0.50 | | | | | | | |--- repeated_guest <= 0.50 | | | | | | | | |--- market_segment_type_Complementary <= 0.50 | | | | | | | | | |--- arrival_date <= 11.50 | | | | | | | | | | |--- lead_time <= 30.00 | | | | | | | | | | | |--- weights: [91.70, 9.11] class: 0.0 | | | | | | | | | | |--- lead_time > 30.00 | | | | | | | | | | | |--- weights: [9.69, 7.59] class: 0.0 | | | | | | | | | |--- arrival_date > 11.50 | | | | | | | | | | |--- no_of_adults <= 1.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- no_of_adults > 1.50 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | |--- market_segment_type_Complementary > 0.50 | | | | | | | | | |--- weights: [39.51, 0.00] class: 0.0 | | | | | | | |--- repeated_guest > 0.50 | | | | | | | | |--- lead_time <= 44.50 | | | | | | | | | |--- weights: [106.61, 0.00] class: 0.0 | | | | | | | | |--- lead_time > 44.50 | | | | | | | | | |--- weights: [2.24, 3.04] class: 1.0 | | | | | | |--- market_segment_type_Offline > 0.50 | | | | | | | |--- avg_price_per_room <= 178.44 | | | | | | | | |--- no_of_weekend_nights <= 0.50 | | | | | | | | | |--- weights: [697.09, 0.00] class: 0.0 | | | | | | | | |--- no_of_weekend_nights > 0.50 | | | | | | | | | |--- avg_price_per_room <= 48.17 | | | | | | | | | | |--- avg_price_per_room <= 46.58 | | | | | | | | | | | |--- weights: [5.22, 1.52] class: 0.0 | | | | | | | | | | |--- avg_price_per_room > 46.58 | | | | | | | | | | | |--- weights: [0.00, 10.63] class: 1.0 | | | | | | | | | |--- avg_price_per_room > 48.17 | | | | | | | | | | |--- lead_time <= 3.50 | | | | | | | | | | | |--- weights: [35.79, 0.00] class: 0.0 | | | | | | | | | | |--- lead_time > 3.50 | | | | | | | | | | | |--- weights: [148.36, 
36.43] class: 0.0 | | | | | | | |--- avg_price_per_room > 178.44 | | | | | | | | |--- lead_time <= 22.50 | | | | | | | | | |--- weights: [2.98, 0.00] class: 0.0 | | | | | | | | |--- lead_time > 22.50 | | | | | | | | | |--- lead_time <= 37.50 | | | | | | | | | | |--- weights: [0.00, 24.29] class: 1.0 | | | | | | | | | |--- lead_time > 37.50 | | | | | | | | | | |--- weights: [1.49, 0.00] class: 0.0 | | | | | |--- no_of_weekend_nights > 1.50 | | | | | | |--- no_of_adults <= 1.50 | | | | | | | |--- arrival_month <= 6.50 | | | | | | | | |--- market_segment_type_Corporate <= 0.50 | | | | | | | | | |--- arrival_month <= 1.50 | | | | | | | | | | |--- weights: [5.96, 0.00] class: 0.0 | | | | | | | | | |--- arrival_month > 1.50 | | | | | | | | | | |--- avg_price_per_room <= 102.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- avg_price_per_room > 102.50 | | | | | | | | | | | |--- weights: [0.00, 68.32] class: 1.0 | | | | | | | | |--- market_segment_type_Corporate > 0.50 | | | | | | | | | |--- avg_price_per_room <= 120.00 | | | | | | | | | | |--- weights: [14.91, 0.00] class: 0.0 | | | | | | | | | |--- avg_price_per_room > 120.00 | | | | | | | | | | |--- weights: [0.00, 3.04] class: 1.0 | | | | | | | |--- arrival_month > 6.50 | | | | | | | | |--- weights: [26.84, 4.55] class: 0.0 | | | | | | |--- no_of_adults > 1.50 | | | | | | | |--- lead_time <= 59.50 | | | | | | | | |--- weights: [175.95, 22.77] class: 0.0 | | | | | | | |--- lead_time > 59.50 | | | | | | | | |--- lead_time <= 63.00 | | | | | | | | | |--- arrival_date <= 19.50 | | | | | | | | | | |--- weights: [2.98, 0.00] class: 0.0 | | | | | | | | | |--- arrival_date > 19.50 | | | | | | | | | | |--- weights: [0.75, 12.14] class: 1.0 | | | | | | | | |--- lead_time > 63.00 | | | | | | | | | |--- weights: [18.64, 3.04] class: 0.0 | | | | |--- arrival_month > 9.50 | | | | | |--- market_segment_type_Offline <= 0.50 | | | | | | |--- arrival_month <= 11.50 | | | | | | | |--- repeated_guest <= 
0.50 | | | | | | | | |--- arrival_month <= 10.50 | | | | | | | | | |--- weights: [84.25, 9.11] class: 0.0 | | | | | | | | |--- arrival_month > 10.50 | | | | | | | | | |--- weights: [59.64, 24.29] class: 0.0 | | | | | | | |--- repeated_guest > 0.50 | | | | | | | | |--- weights: [55.17, 0.00] class: 0.0 | | | | | | |--- arrival_month > 11.50 | | | | | | | |--- weights: [85.74, 0.00] class: 0.0 | | | | | |--- market_segment_type_Offline > 0.50 | | | | | | |--- no_of_weekend_nights <= 0.50 | | | | | | | |--- weights: [405.58, 0.00] class: 0.0 | | | | | | |--- no_of_weekend_nights > 0.50 | | | | | | | |--- avg_price_per_room <= 67.25 | | | | | | | | |--- weights: [87.98, 0.00] class: 0.0 | | | | | | | |--- avg_price_per_room > 67.25 | | | | | | | | |--- weights: [225.16, 18.22] class: 0.0 | | | |--- lead_time > 74.50 | | | | |--- arrival_month <= 11.50 | | | | | |--- arrival_month <= 1.50 | | | | | | |--- weights: [73.06, 0.00] class: 0.0 | | | | | |--- arrival_month > 1.50 | | | | | | |--- lead_time <= 99.50 | | | | | | | |--- no_of_weekend_nights <= 0.50 | | | | | | | | |--- lead_time <= 85.50 | | | | | | | | | |--- weights: [61.88, 0.00] class: 0.0 | | | | | | | | |--- lead_time > 85.50 | | | | | | | | | |--- weights: [99.16, 19.74] class: 0.0 | | | | | | | |--- no_of_weekend_nights > 0.50 | | | | | | | | |--- avg_price_per_room <= 99.98 | | | | | | | | | |--- lead_time <= 98.50 | | | | | | | | | | |--- avg_price_per_room <= 61.88 | | | | | | | | | | | |--- weights: [17.89, 0.00] class: 0.0 | | | | | | | | | | |--- avg_price_per_room > 61.88 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | |--- lead_time > 98.50 | | | | | | | | | | |--- weights: [0.00, 13.66] class: 1.0 | | | | | | | | |--- avg_price_per_room > 99.98 | | | | | | | | | |--- arrival_date <= 9.50 | | | | | | | | | | |--- weights: [11.18, 0.00] class: 0.0 | | | | | | | | | |--- arrival_date > 9.50 | | | | | | | | | | |--- no_of_week_nights <= 1.50 | | | | | | | | | | | |--- 
weights: [0.75, 68.32] class: 1.0 | | | | | | | | | | |--- no_of_week_nights > 1.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | |--- lead_time > 99.50 | | | | | | | |--- lead_time <= 116.50 | | | | | | | | |--- no_of_week_nights <= 1.50 | | | | | | | | | |--- no_of_weekend_nights <= 0.50 | | | | | | | | | | |--- no_of_adults <= 1.50 | | | | | | | | | | | |--- weights: [3.73, 1.52] class: 0.0 | | | | | | | | | | |--- no_of_adults > 1.50 | | | | | | | | | | | |--- weights: [0.75, 86.53] class: 1.0 | | | | | | | | | |--- no_of_weekend_nights > 0.50 | | | | | | | | | | |--- weights: [25.35, 92.61] class: 1.0 | | | | | | | | |--- no_of_week_nights > 1.50 | | | | | | | | | |--- market_segment_type_Corporate <= 0.50 | | | | | | | | | | |--- avg_price_per_room <= 109.50 | | | | | | | | | | | |--- truncated branch of depth 7 | | | | | | | | | | |--- avg_price_per_room > 109.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | |--- market_segment_type_Corporate > 0.50 | | | | | | | | | | |--- weights: [0.75, 39.47] class: 1.0 | | | | | | | |--- lead_time > 116.50 | | | | | | | | |--- no_of_adults <= 1.50 | | | | | | | | | |--- avg_price_per_room <= 122.00 | | | | | | | | | | |--- weights: [64.86, 0.00] class: 0.0 | | | | | | | | | |--- avg_price_per_room > 122.00 | | | | | | | | | | |--- weights: [0.00, 3.04] class: 1.0 | | | | | | | | |--- no_of_adults > 1.50 | | | | | | | | | |--- no_of_week_nights <= 2.50 | | | | | | | | | | |--- arrival_date <= 27.50 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | | |--- arrival_date > 27.50 | | | | | | | | | | | |--- weights: [14.17, 1.52] class: 0.0 | | | | | | | | | |--- no_of_week_nights > 2.50 | | | | | | | | | | |--- avg_price_per_room <= 74.12 | | | | | | | | | | | |--- weights: [9.69, 10.63] class: 1.0 | | | | | | | | | | |--- avg_price_per_room > 74.12 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | |--- arrival_month > 11.50 | | | | | |--- 
avg_price_per_room <= 165.11 | | | | | | |--- weights: [99.16, 0.00] class: 0.0 | | | | | |--- avg_price_per_room > 165.11 | | | | | | |--- weights: [0.75, 1.52] class: 1.0 | | |--- market_segment_type_Online > 0.50 | | | |--- lead_time <= 13.50 | | | | |--- arrival_month <= 11.50 | | | | | |--- lead_time <= 3.50 | | | | | | |--- arrival_month <= 1.50 | | | | | | | |--- weights: [49.95, 0.00] class: 0.0 | | | | | | |--- arrival_month > 1.50 | | | | | | | |--- avg_price_per_room <= 178.78 | | | | | | | | |--- arrival_month <= 5.50 | | | | | | | | | |--- no_of_weekend_nights <= 1.50 | | | | | | | | | | |--- avg_price_per_room <= 77.50 | | | | | | | | | | | |--- weights: [17.89, 0.00] class: 0.0 | | | | | | | | | | |--- avg_price_per_room > 77.50 | | | | | | | | | | | |--- weights: [105.87, 51.62] class: 0.0 | | | | | | | | | |--- no_of_weekend_nights > 1.50 | | | | | | | | | | |--- arrival_date <= 23.50 | | | | | | | | | | | |--- weights: [8.20, 4.55] class: 0.0 | | | | | | | | | | |--- arrival_date > 23.50 | | | | | | | | | | | |--- weights: [0.00, 22.77] class: 1.0 | | | | | | | | |--- arrival_month > 5.50 | | | | | | | | | |--- weights: [225.90, 39.47] class: 0.0 | | | | | | | |--- avg_price_per_room > 178.78 | | | | | | | | |--- weights: [14.17, 25.81] class: 1.0 | | | | | |--- lead_time > 3.50 | | | | | | |--- avg_price_per_room <= 99.38 | | | | | | | |--- arrival_month <= 1.50 | | | | | | | | |--- weights: [44.73, 0.00] class: 0.0 | | | | | | | |--- arrival_month > 1.50 | | | | | | | | |--- avg_price_per_room <= 69.37 | | | | | | | | | |--- weights: [20.13, 0.00] class: 0.0 | | | | | | | | |--- avg_price_per_room > 69.37 | | | | | | | | | |--- no_of_week_nights <= 4.50 | | | | | | | | | | |--- weights: [87.23, 62.24] class: 0.0 | | | | | | | | | |--- no_of_week_nights > 4.50 | | | | | | | | | | |--- weights: [0.75, 10.63] class: 1.0 | | | | | | |--- avg_price_per_room > 99.38 | | | | | | | |--- arrival_year <= 2017.50 | | | | | | | | |--- weights: [23.86, 4.55] 
class: 0.0 | | | | | | | |--- arrival_year > 2017.50 | | | | | | | | |--- arrival_month <= 8.50 | | | | | | | | | |--- weights: [57.41, 229.24] class: 1.0 | | | | | | | | |--- arrival_month > 8.50 | | | | | | | | | |--- arrival_date <= 14.00 | | | | | | | | | | |--- weights: [9.69, 36.43] class: 1.0 | | | | | | | | | |--- arrival_date > 14.00 | | | | | | | | | | |--- weights: [20.13, 10.63] class: 0.0 | | | | |--- arrival_month > 11.50 | | | | | |--- weights: [123.02, 0.00] class: 0.0 | | | |--- lead_time > 13.50 | | | | |--- avg_price_per_room <= 105.27 | | | | | |--- avg_price_per_room <= 59.43 | | | | | | |--- avg_price_per_room <= 28.50 | | | | | | | |--- weights: [25.35, 0.00] class: 0.0 | | | | | | |--- avg_price_per_room > 28.50 | | | | | | | |--- weights: [42.50, 22.77] class: 0.0 | | | | | |--- avg_price_per_room > 59.43 | | | | | | |--- lead_time <= 25.50 | | | | | | | |--- arrival_month <= 11.50 | | | | | | | | |--- arrival_month <= 1.50 | | | | | | | | | |--- weights: [21.62, 0.00] class: 0.0 | | | | | | | | |--- arrival_month > 1.50 | | | | | | | | | |--- arrival_year <= 2017.50 | | | | | | | | | | |--- type_of_meal_plan <= 0.00 | | | | | | | | | | | |--- weights: [1.49, 3.04] class: 1.0 | | | | | | | | | | |--- type_of_meal_plan > 0.00 | | | | | | | | | | | |--- weights: [20.88, 0.00] class: 0.0 | | | | | | | | | |--- arrival_year > 2017.50 | | | | | | | | | | |--- weights: [43.99, 126.00] class: 1.0 | | | | | | | |--- arrival_month > 11.50 | | | | | | | | |--- weights: [40.26, 0.00] class: 0.0 | | | | | | |--- lead_time > 25.50 | | | | | | | |--- required_car_parking_space <= 0.50 | | | | | | | | |--- type_of_meal_plan <= 0.00 | | | | | | | | | |--- weights: [99.16, 487.32] class: 1.0 | | | | | | | | |--- type_of_meal_plan > 0.00 | | | | | | | | | |--- type_of_meal_plan <= 1.50 | | | | | | | | | | |--- arrival_year <= 2017.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- arrival_year > 2017.50 | | | | | | | | | | | 
|--- truncated branch of depth 5 | | | | | | | | | |--- type_of_meal_plan > 1.50 | | | | | | | | | | |--- no_of_week_nights <= 2.50 | | | | | | | | | | | |--- weights: [0.00, 53.13] class: 1.0 | | | | | | | | | | |--- no_of_week_nights > 2.50 | | | | | | | | | | | |--- weights: [2.24, 3.04] class: 1.0 | | | | | | | |--- required_car_parking_space > 0.50 | | | | | | | | |--- weights: [15.66, 0.00] class: 0.0 | | | | |--- avg_price_per_room > 105.27 | | | | | |--- required_car_parking_space <= 0.50 | | | | | | |--- arrival_month <= 10.50 | | | | | | | |--- avg_price_per_room <= 173.12 | | | | | | | | |--- arrival_year <= 2017.50 | | | | | | | | | |--- lead_time <= 71.00 | | | | | | | | | | |--- weights: [14.17, 6.07] class: 0.0 | | | | | | | | | |--- lead_time > 71.00 | | | | | | | | | | |--- weights: [0.00, 15.18] class: 1.0 | | | | | | | | |--- arrival_year > 2017.50 | | | | | | | | | |--- room_type_reserved_Room_Type 5 <= 0.50 | | | | | | | | | | |--- arrival_date <= 24.50 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | | |--- arrival_date > 24.50 | | | | | | | | | | | |--- weights: [80.52, 343.09] class: 1.0 | | | | | | | | | |--- room_type_reserved_Room_Type 5 > 0.50 | | | | | | | | | | |--- arrival_date <= 27.50 | | | | | | | | | | | |--- weights: [11.18, 4.55] class: 0.0 | | | | | | | | | | |--- arrival_date > 27.50 | | | | | | | | | | | |--- weights: [0.00, 6.07] class: 1.0 | | | | | | | |--- avg_price_per_room > 173.12 | | | | | | | | |--- no_of_children <= 1.50 | | | | | | | | | |--- weights: [13.42, 89.57] class: 1.0 | | | | | | | | |--- no_of_children > 1.50 | | | | | | | | | |--- weights: [3.73, 154.85] class: 1.0 | | | | | | |--- arrival_month > 10.50 | | | | | | | |--- lead_time <= 22.50 | | | | | | | | |--- no_of_adults <= 1.50 | | | | | | | | | |--- weights: [0.75, 6.07] class: 1.0 | | | | | | | | |--- no_of_adults > 1.50 | | | | | | | | | |--- weights: [16.40, 0.00] class: 0.0 | | | | | | | |--- lead_time > 22.50 | | | | | | 
| | |--- avg_price_per_room <= 168.06 | | | | | | | | | |--- avg_price_per_room <= 147.75 | | | | | | | | | | |--- weights: [20.13, 62.24] class: 1.0 | | | | | | | | | |--- avg_price_per_room > 147.75 | | | | | | | | | | |--- weights: [0.00, 22.77] class: 1.0 | | | | | | | | |--- avg_price_per_room > 168.06 | | | | | | | | | |--- weights: [11.93, 6.07] class: 0.0 | | | | | |--- required_car_parking_space > 0.50 | | | | | | |--- no_of_week_nights <= 5.50 | | | | | | | |--- weights: [29.08, 0.00] class: 0.0 | | | | | | |--- no_of_week_nights > 5.50 | | | | | | | |--- weights: [0.00, 1.52] class: 1.0 | |--- no_of_special_requests > 0.50 | | |--- no_of_special_requests <= 1.50 | | | |--- market_segment_type_Online <= 0.50 | | | | |--- lead_time <= 71.50 | | | | | |--- avg_price_per_room <= 94.50 | | | | | | |--- weights: [474.92, 0.00] class: 0.0 | | | | | |--- avg_price_per_room > 94.50 | | | | | | |--- market_segment_type_Corporate <= 0.50 | | | | | | | |--- weights: [97.67, 0.00] class: 0.0 | | | | | | |--- market_segment_type_Corporate > 0.50 | | | | | | | |--- no_of_weekend_nights <= 0.50 | | | | | | | | |--- weights: [29.08, 1.52] class: 0.0 | | | | | | | |--- no_of_weekend_nights > 0.50 | | | | | | | | |--- weights: [3.73, 6.07] class: 1.0 | | | | |--- lead_time > 71.50 | | | | | |--- type_of_meal_plan <= 0.00 | | | | | | |--- weights: [0.00, 9.11] class: 1.0 | | | | | |--- type_of_meal_plan > 0.00 | | | | | | |--- lead_time <= 91.50 | | | | | | | |--- weights: [73.81, 0.00] class: 0.0 | | | | | | |--- lead_time > 91.50 | | | | | | | |--- no_of_week_nights <= 2.50 | | | | | | | | |--- weights: [37.28, 19.74] class: 0.0 | | | | | | | |--- no_of_week_nights > 2.50 | | | | | | | | |--- arrival_date <= 8.50 | | | | | | | | | |--- weights: [17.89, 4.55] class: 0.0 | | | | | | | | |--- arrival_date > 8.50 | | | | | | | | | |--- weights: [55.17, 0.00] class: 0.0 | | | |--- market_segment_type_Online > 0.50 | | | | |--- lead_time <= 6.50 | | | | | |--- 
avg_price_per_room <= 157.64 | | | | | | |--- weights: [556.18, 48.58] class: 0.0 | | | | | |--- avg_price_per_room > 157.64 | | | | | | |--- arrival_date <= 18.50 | | | | | | | |--- arrival_date <= 16.50 | | | | | | | | |--- weights: [38.02, 12.14] class: 0.0 | | | | | | | |--- arrival_date > 16.50 | | | | | | | | |--- weights: [1.49, 7.59] class: 1.0 | | | | | | |--- arrival_date > 18.50 | | | | | | | |--- weights: [39.51, 3.04] class: 0.0 | | | | |--- lead_time > 6.50 | | | | | |--- required_car_parking_space <= 0.50 | | | | | | |--- arrival_month <= 1.50 | | | | | | | |--- weights: [78.28, 0.00] class: 0.0 | | | | | | |--- arrival_month > 1.50 | | | | | | | |--- arrival_month <= 11.50 | | | | | | | | |--- avg_price_per_room <= 118.55 | | | | | | | | | |--- no_of_weekend_nights <= 2.50 | | | | | | | | | | |--- avg_price_per_room <= 66.30 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- avg_price_per_room > 66.30 | | | | | | | | | | | |--- truncated branch of depth 8 | | | | | | | | | |--- no_of_weekend_nights > 2.50 | | | | | | | | | | |--- weights: [3.73, 27.33] class: 1.0 | | | | | | | | |--- avg_price_per_room > 118.55 | | | | | | | | | |--- arrival_month <= 8.50 | | | | | | | | | | |--- arrival_date <= 19.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- arrival_date > 19.50 | | | | | | | | | | | |--- weights: [174.46, 159.40] class: 0.0 | | | | | | | | | |--- arrival_month > 8.50 | | | | | | | | | | |--- arrival_year <= 2017.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- arrival_year > 2017.50 | | | | | | | | | | | |--- weights: [258.71, 344.61] class: 1.0 | | | | | | | |--- arrival_month > 11.50 | | | | | | | | |--- lead_time <= 100.00 | | | | | | | | | |--- no_of_weekend_nights <= 3.00 | | | | | | | | | | |--- weights: [242.30, 1.52] class: 0.0 | | | | | | | | | |--- no_of_weekend_nights > 3.00 | | | | | | | | | | |--- weights: [1.49, 3.04] class: 1.0 | | | 
| | | | | |--- lead_time > 100.00 | | | | | | | | | |--- avg_price_per_room <= 92.00 | | | | | | | | | | |--- weights: [13.42, 19.74] class: 1.0 | | | | | | | | | |--- avg_price_per_room > 92.00 | | | | | | | | | | |--- weights: [2.24, 30.36] class: 1.0 | | | | | |--- required_car_parking_space > 0.50 | | | | | | |--- no_of_weekend_nights <= 3.00 | | | | | | | |--- weights: [140.16, 0.00] class: 0.0 | | | | | | |--- no_of_weekend_nights > 3.00 | | | | | | | |--- weights: [0.00, 1.52] class: 1.0 | | |--- no_of_special_requests > 1.50 | | | |--- lead_time <= 90.50 | | | | |--- no_of_week_nights <= 3.50 | | | | | |--- weights: [1585.04, 0.00] class: 0.0 | | | | |--- no_of_week_nights > 3.50 | | | | | |--- no_of_special_requests <= 2.50 | | | | | | |--- lead_time <= 4.50 | | | | | | | |--- weights: [22.37, 0.00] class: 0.0 | | | | | | |--- lead_time > 4.50 | | | | | | | |--- weights: [158.06, 57.69] class: 0.0 | | | | | |--- no_of_special_requests > 2.50 | | | | | | |--- weights: [52.19, 0.00] class: 0.0 | | | |--- lead_time > 90.50 | | | | |--- no_of_special_requests <= 2.50 | | | | | |--- arrival_month <= 8.50 | | | | | | |--- lead_time <= 150.50 | | | | | | | |--- arrival_year <= 2017.50 | | | | | | | | |--- weights: [8.20, 12.14] class: 1.0 | | | | | | | |--- arrival_year > 2017.50 | | | | | | | | |--- no_of_children <= 0.50 | | | | | | | | | |--- avg_price_per_room <= 157.50 | | | | | | | | | | |--- weights: [140.91, 13.66] class: 0.0 | | | | | | | | | |--- avg_price_per_room > 157.50 | | | | | | | | | | |--- weights: [6.71, 6.07] class: 0.0 | | | | | | | | |--- no_of_children > 0.50 | | | | | | | | | |--- weights: [27.59, 16.70] class: 0.0 | | | | | | |--- lead_time > 150.50 | | | | | | | |--- weights: [1.49, 7.59] class: 1.0 | | | | | |--- arrival_month > 8.50 | | | | | | |--- weights: [106.61, 106.27] class: 0.0 | | | | |--- no_of_special_requests > 2.50 | | | | | |--- weights: [67.10, 0.00] class: 0.0 |--- lead_time > 151.50 | |--- avg_price_per_room <= 100.04 
| | |--- no_of_special_requests <= 0.50 | | | |--- market_segment_type_Online <= 0.50 | | | | |--- no_of_adults <= 1.50 | | | | | |--- lead_time <= 163.50 | | | | | | |--- avg_price_per_room <= 85.50 | | | | | | | |--- weights: [3.73, 1.52] class: 0.0 | | | | | | |--- avg_price_per_room > 85.50 | | | | | | | |--- weights: [0.00, 22.77] class: 1.0 | | | | | |--- lead_time > 163.50 | | | | | | |--- no_of_week_nights <= 4.50 | | | | | | | |--- lead_time <= 173.00 | | | | | | | | |--- arrival_date <= 3.50 | | | | | | | | | |--- weights: [46.97, 9.11] class: 0.0 | | | | | | | | |--- arrival_date > 3.50 | | | | | | | | | |--- no_of_weekend_nights <= 1.00 | | | | | | | | | | |--- weights: [0.00, 13.66] class: 1.0 | | | | | | | | | |--- no_of_weekend_nights > 1.00 | | | | | | | | | | |--- weights: [2.24, 0.00] class: 0.0 | | | | | | | |--- lead_time > 173.00 | | | | | | | | |--- arrival_month <= 5.50 | | | | | | | | | |--- weights: [6.71, 7.59] class: 1.0 | | | | | | | | |--- arrival_month > 5.50 | | | | | | | | | |--- lead_time <= 231.00 | | | | | | | | | | |--- weights: [79.03, 0.00] class: 0.0 | | | | | | | | | |--- lead_time > 231.00 | | | | | | | | | | |--- arrival_month <= 7.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- arrival_month > 7.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | |--- no_of_week_nights > 4.50 | | | | | | | |--- weights: [2.24, 12.14] class: 1.0 | | | | |--- no_of_adults > 1.50 | | | | | |--- arrival_month <= 11.50 | | | | | | |--- avg_price_per_room <= 88.50 | | | | | | | |--- lead_time <= 269.50 | | | | | | | | |--- no_of_weekend_nights <= 0.50 | | | | | | | | | |--- lead_time <= 168.50 | | | | | | | | | | |--- weights: [8.95, 1.52] class: 0.0 | | | | | | | | | |--- lead_time > 168.50 | | | | | | | | | | |--- arrival_date <= 19.50 | | | | | | | | | | | |--- weights: [4.47, 89.57] class: 1.0 | | | | | | | | | | |--- arrival_date > 19.50 | | | | | | | | | | | |--- weights: [2.24, 0.00] 
class: 0.0 | | | | | | | | |--- no_of_weekend_nights > 0.50 | | | | | | | | | |--- avg_price_per_room <= 83.72 | | | | | | | | | | |--- no_of_week_nights <= 1.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- no_of_week_nights > 1.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | |--- avg_price_per_room > 83.72 | | | | | | | | | | |--- weights: [0.75, 34.92] class: 1.0 | | | | | | | |--- lead_time > 269.50 | | | | | | | | |--- arrival_year <= 2017.50 | | | | | | | | | |--- weights: [16.40, 0.00] class: 0.0 | | | | | | | | |--- arrival_year > 2017.50 | | | | | | | | | |--- no_of_week_nights <= 3.50 | | | | | | | | | | |--- weights: [8.95, 214.05] class: 1.0 | | | | | | | | | |--- no_of_week_nights > 3.50 | | | | | | | | | | |--- avg_price_per_room <= 74.00 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- avg_price_per_room > 74.00 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | |--- avg_price_per_room > 88.50 | | | | | | | |--- room_type_reserved_Room_Type 4 <= 0.50 | | | | | | | | |--- no_of_adults <= 2.50 | | | | | | | | | |--- weights: [7.46, 458.47] class: 1.0 | | | | | | | | |--- no_of_adults > 2.50 | | | | | | | | | |--- weights: [2.98, 0.00] class: 0.0 | | | | | | | |--- room_type_reserved_Room_Type 4 > 0.50 | | | | | | | | |--- weights: [4.47, 0.00] class: 0.0 | | | | | |--- arrival_month > 11.50 | | | | | | |--- weights: [53.68, 0.00] class: 0.0 | | | |--- market_segment_type_Online > 0.50 | | | | |--- avg_price_per_room <= 35.17 | | | | | |--- no_of_weekend_nights <= 0.50 | | | | | | |--- weights: [2.24, 9.11] class: 1.0 | | | | | |--- no_of_weekend_nights > 0.50 | | | | | | |--- weights: [6.71, 0.00] class: 0.0 | | | | |--- avg_price_per_room > 35.17 | | | | | |--- arrival_month <= 11.50 | | | | | | |--- weights: [0.00, 793.97] class: 1.0 | | | | | |--- arrival_month > 11.50 | | | | | | |--- weights: [3.73, 115.38] class: 1.0 | | |--- 
no_of_special_requests > 0.50 | | | |--- no_of_weekend_nights <= 0.50 | | | | |--- lead_time <= 180.50 | | | | | |--- weights: [44.73, 12.14] class: 0.0 | | | | |--- lead_time > 180.50 | | | | | |--- no_of_special_requests <= 2.50 | | | | | | |--- market_segment_type_Online <= 0.50 | | | | | | | |--- weights: [12.67, 6.07] class: 0.0 | | | | | | |--- market_segment_type_Online > 0.50 | | | | | | | |--- arrival_month <= 11.50 | | | | | | | | |--- avg_price_per_room <= 44.12 | | | | | | | | | |--- weights: [1.49, 0.00] class: 0.0 | | | | | | | | |--- avg_price_per_room > 44.12 | | | | | | | | | |--- weights: [0.00, 189.76] class: 1.0 | | | | | | | |--- arrival_month > 11.50 | | | | | | | | |--- weights: [5.96, 16.70] class: 1.0 | | | | | |--- no_of_special_requests > 2.50 | | | | | | |--- weights: [8.95, 0.00] class: 0.0 | | | |--- no_of_weekend_nights > 0.50 | | | | |--- market_segment_type_Online <= 0.50 | | | | | |--- arrival_date <= 13.50 | | | | | | |--- weights: [67.85, 0.00] class: 0.0 | | | | | |--- arrival_date > 13.50 | | | | | | |--- weights: [44.73, 7.59] class: 0.0 | | | | |--- market_segment_type_Online > 0.50 | | | | | |--- arrival_month <= 11.50 | | | | | | |--- avg_price_per_room <= 76.48 | | | | | | | |--- weights: [46.97, 4.55] class: 0.0 | | | | | | |--- avg_price_per_room > 76.48 | | | | | | | |--- weights: [184.15, 106.27] class: 0.0 | | | | | |--- arrival_month > 11.50 | | | | | | |--- weights: [19.38, 34.92] class: 1.0 | |--- avg_price_per_room > 100.04 | | |--- arrival_month <= 11.50 | | | |--- no_of_special_requests <= 2.50 | | | | |--- weights: [0.00, 3200.19] class: 1.0 | | | |--- no_of_special_requests > 2.50 | | | | |--- weights: [23.11, 0.00] class: 0.0 | | |--- arrival_month > 11.50 | | | |--- no_of_special_requests <= 0.50 | | | | |--- weights: [35.04, 0.00] class: 0.0 | | | |--- no_of_special_requests > 0.50 | | | | |--- arrival_date <= 24.50 | | | | | |--- weights: [3.73, 0.00] class: 0.0 | | | | |--- arrival_date > 24.50 | | | | | 
|--- weights: [3.73, 22.77] class: 1.0
# Importance of each feature in building the tuned tree. The importance of a feature is computed
# as the (normalized) total reduction of the criterion brought by that feature; it is also known
# as the Gini importance.
print(
    pd.DataFrame(
        estimator.feature_importances_, columns=["Imp"], index=X_train.columns
    ).sort_values(by="Imp", ascending=False)
)
                                           Imp
lead_time                             0.355371
market_segment_type_Online            0.138466
avg_price_per_room                    0.131435
no_of_special_requests                0.126682
arrival_month                         0.085446
no_of_weekend_nights                  0.038676
no_of_adults                          0.029807
no_of_week_nights                     0.023794
arrival_date                          0.020978
arrival_year                          0.013810
required_car_parking_space            0.011676
type_of_meal_plan                     0.006148
market_segment_type_Corporate         0.005068
market_segment_type_Offline           0.004843
repeated_guest                        0.003012
room_type_reserved_Room_Type 4        0.001511
market_segment_type_Complementary     0.001281
no_of_children                        0.001206
room_type_reserved_Room_Type 5        0.000792
room_type_reserved_Room_Type 7        0.000000
market_segment_type_Aviation          0.000000
room_type_reserved_Room_Type 3        0.000000
room_type_reserved_Room_Type 6        0.000000
room_type_reserved_Room_Type 2        0.000000
no_of_previous_bookings_not_canceled  0.000000
no_of_previous_cancellations          0.000000
room_type_reserved_Room_Type 1        0.000000
importances = estimator.feature_importances_
indices = np.argsort(importances)
plt.figure(figsize=(12, 12))
plt.title("Feature Importances")
plt.barh(range(len(indices)), importances[indices], color="violet", align="center")
plt.yticks(range(len(indices)), [feature_names[i] for i in indices])
plt.xlabel("Relative Importance")
plt.show()
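# Compute the cost-complexity pruning path of the fully grown tree
clf = DecisionTreeClassifier(random_state=1, class_weight="balanced")
path = clf.cost_complexity_pruning_path(X_train, y_train)
# abs() guards against tiny negative alphas caused by floating-point error
ccp_alphas, impurities = abs(path.ccp_alphas), path.impurities
pd.DataFrame(path)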
| | ccp_alphas | impurities |
|---|---|---|
| 0 | 0.000000e+00 | 0.008376 |
| 1 | 0.000000e+00 | 0.008376 |
| 2 | 1.303920e-20 | 0.008376 |
| 3 | 1.303920e-20 | 0.008376 |
| 4 | 1.303920e-20 | 0.008376 |
| ... | ... | ... |
| 1904 | 8.901596e-03 | 0.328058 |
| 1905 | 9.802243e-03 | 0.337860 |
| 1906 | 1.271875e-02 | 0.350579 |
| 1907 | 3.412090e-02 | 0.418821 |
| 1908 | 8.117914e-02 | 0.500000 |
1909 rows × 2 columns
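For reference, the two columns above come from minimal cost-complexity pruning as described in the scikit-learn documentation. The notation below is a sketch of that standard formulation and is not taken from this notebook: $R(T)$ is the total sample-weighted impurity of the leaves of tree $T$ (the `impurities` column), $|\widetilde{T}|$ is the number of leaves, and $T_t$ is the subtree rooted at node $t$.

```latex
% Cost-complexity measure of a tree T for a given alpha >= 0
R_\alpha(T) = R(T) + \alpha \, |\widetilde{T}|

% Effective alpha of an internal node t: the value of alpha at which
% collapsing the subtree T_t into the single node t costs the same as keeping it
\alpha_{\text{eff}}(t) = \frac{R(t) - R(T_t)}{|\widetilde{T}_t| - 1}
```

The node with the smallest effective alpha is pruned first, so each row of the path corresponds to one further pruning step, and the last row is the tree reduced to its root.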
fig, ax = plt.subplots(figsize=(10, 5))
ax.plot(ccp_alphas[:-1], impurities[:-1], marker="o", drawstyle="steps-post")
ax.set_xlabel("effective alpha")
ax.set_ylabel("total impurity of leaves")
ax.set_title("Total Impurity vs effective alpha for training set")
plt.show()
# Fit one tree per candidate alpha along the pruning path
clfs = []
for ccp_alpha in ccp_alphas:
    clf = DecisionTreeClassifier(
        random_state=1, ccp_alpha=ccp_alpha, class_weight="balanced"
    )
    clf.fit(X_train, y_train)
    clfs.append(clf)
print(
    "Number of nodes in the last tree is: {} with ccp_alpha: {}".format(
        clfs[-1].tree_.node_count, ccp_alphas[-1]
    )
)
Number of nodes in the last tree is: 1 with ccp_alpha: 0.08117914389136949
# Drop the last entry: it is the trivial tree with a single node
clfs = clfs[:-1]
ccp_alphas = ccp_alphas[:-1]
node_counts = [clf.tree_.node_count for clf in clfs]
depth = [clf.tree_.max_depth for clf in clfs]
fig, ax = plt.subplots(2, 1, figsize=(10, 7))
ax[0].plot(ccp_alphas, node_counts, marker="o", drawstyle="steps-post")
ax[0].set_xlabel("alpha")
ax[0].set_ylabel("number of nodes")
ax[0].set_title("Number of nodes vs alpha")
ax[1].plot(ccp_alphas, depth, marker="o", drawstyle="steps-post")
ax[1].set_xlabel("alpha")
ax[1].set_ylabel("depth of tree")
ax[1].set_title("Depth vs alpha")
fig.tight_layout()
# F1 score of every pruned tree on the training set
f1_train = []
for clf in clfs:
    pred_train = clf.predict(X_train)
    values_train = f1_score(y_train, pred_train)
    f1_train.append(values_train)
# F1 score of every pruned tree on the test set
f1_test = []
for clf in clfs:
    pred_test = clf.predict(X_test)
    values_test = f1_score(y_test, pred_test)
    f1_test.append(values_test)
fig, ax = plt.subplots(figsize=(15, 5))
ax.set_xlabel("alpha")
ax.set_ylabel("F1 Score")
ax.set_title("F1 Score vs alpha for training and testing sets")
ax.plot(ccp_alphas, f1_train, marker="o", label="train", drawstyle="steps-post")
ax.plot(ccp_alphas, f1_test, marker="o", label="test", drawstyle="steps-post")
ax.legend()
plt.show()
index_best_model = np.argmax(f1_test)
best_model = clfs[index_best_model]
print(best_model)
DecisionTreeClassifier(ccp_alpha=0.0001229122417153721, class_weight='balanced',
random_state=1)
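The selection above picks the alpha with the highest F1 score on the test set, so the test metrics reported afterwards may be somewhat optimistic. One possible alternative, sketched below under stated assumptions, is to choose alpha by cross-validation on the training data only. The synthetic data, the thinned alpha grid, and the `_cv` names are illustrative assumptions, not the notebook's method.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption; in the notebook this would be X_train / y_train).
X_cv, y_cv = make_classification(n_samples=600, n_features=8, random_state=1)

base = DecisionTreeClassifier(random_state=1, class_weight="balanced")
# Drop the last alpha, which prunes the tree down to the root node.
alphas_cv = base.cost_complexity_pruning_path(X_cv, y_cv).ccp_alphas[:-1]
# Clip tiny negative values from floating-point error, remove duplicates,
# and keep every fifth alpha so the demo runs quickly.
alphas_cv = np.unique(np.clip(alphas_cv, 0, None))[::5]

mean_f1 = []
for alpha in alphas_cv:
    model = DecisionTreeClassifier(
        random_state=1, ccp_alpha=alpha, class_weight="balanced"
    )
    # Mean 5-fold cross-validated F1; no held-out test data is touched here.
    mean_f1.append(cross_val_score(model, X_cv, y_cv, cv=5, scoring="f1").mean())

best_alpha_cv = alphas_cv[int(np.argmax(mean_f1))]
print("alpha chosen by cross-validation:", best_alpha_cv)
```

With this approach the test set is used only once, to report the performance of the tree refit with the chosen alpha.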
confusion_matrix_sklearn(best_model, X_train, y_train)
decision_tree_post_perf_train = model_performance_classification_sklearn(
best_model, X_train, y_train
)
decision_tree_post_perf_train
| | Accuracy | Recall | Precision | F1 |
|---|---|---|---|---|
| 0 | 0.896542 | 0.902786 | 0.806279 | 0.851808 |
confusion_matrix_sklearn(best_model, X_test, y_test)
decision_tree_post_perf_test = model_performance_classification_sklearn(
best_model, X_test, y_test
)
decision_tree_post_perf_test
| | Accuracy | Recall | Precision | F1 |
|---|---|---|---|---|
| 0 | 0.866948 | 0.857751 | 0.761341 | 0.806676 |
plt.figure(figsize=(20, 10))
out = tree.plot_tree(
best_model,
feature_names=feature_names,
filled=True,
fontsize=9,
node_ids=False,
class_names=None,
)
for o in out:
    arrow = o.arrow_patch
    if arrow is not None:
        arrow.set_edgecolor("black")
        arrow.set_linewidth(1)
plt.show()
# Text report showing the rules of the post-pruned decision tree
print(tree.export_text(best_model, feature_names=feature_names, show_weights=True))
|--- lead_time <= 151.50 | |--- no_of_special_requests <= 0.50 | | |--- market_segment_type_Online <= 0.50 | | | |--- lead_time <= 90.50 | | | | |--- no_of_weekend_nights <= 0.50 | | | | | |--- avg_price_per_room <= 179.47 | | | | | | |--- market_segment_type_Offline <= 0.50 | | | | | | | |--- lead_time <= 16.50 | | | | | | | | |--- avg_price_per_room <= 68.50 | | | | | | | | | |--- weights: [207.26, 10.63] class: 0.0 | | | | | | | | |--- avg_price_per_room > 68.50 | | | | | | | | | |--- arrival_date <= 29.50 | | | | | | | | | | |--- no_of_adults <= 1.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- no_of_adults > 1.50 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | |--- arrival_date > 29.50 | | | | | | | | | | |--- weights: [2.24, 7.59] class: 1.0 | | | | | | | |--- lead_time > 16.50 | | | | | | | | |--- avg_price_per_room <= 135.00 | | | | | | | | | |--- arrival_month <= 11.50 | | | | | | | | | | |--- no_of_previous_bookings_not_canceled <= 0.50 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | | |--- no_of_previous_bookings_not_canceled > 0.50 | | | | | | | | | | | |--- weights: [11.18, 0.00] class: 0.0 | | | | | | | | | |--- arrival_month > 11.50 | | | | | | | | | | |--- weights: [21.62, 0.00] class: 0.0 | | | | | | | | |--- avg_price_per_room > 135.00 | | | | | | | | | |--- weights: [0.00, 12.14] class: 1.0 | | | | | | |--- market_segment_type_Offline > 0.50 | | | | | | | |--- weights: [1197.36, 0.00] class: 0.0 | | | | | |--- avg_price_per_room > 179.47 | | | | | | |--- weights: [2.98, 25.81] class: 1.0 | | | | |--- no_of_weekend_nights > 0.50 | | | | | |--- lead_time <= 68.50 | | | | | | |--- arrival_month <= 9.50 | | | | | | | |--- avg_price_per_room <= 63.29 | | | | | | | | |--- arrival_date <= 20.50 | | | | | | | | | |--- type_of_meal_plan <= 0.00 | | | | | | | | | | |--- weights: [0.75, 3.04] class: 1.0 | | | | | | | | | |--- type_of_meal_plan > 0.00 | | | | | | | | | | 
|--- weights: [41.75, 0.00] class: 0.0 | | | | | | | | |--- arrival_date > 20.50 | | | | | | | | | |--- avg_price_per_room <= 59.75 | | | | | | | | | | |--- arrival_date <= 23.50 | | | | | | | | | | | |--- weights: [1.49, 12.14] class: 1.0 | | | | | | | | | | |--- arrival_date > 23.50 | | | | | | | | | | | |--- weights: [14.91, 1.52] class: 0.0 | | | | | | | | | |--- avg_price_per_room > 59.75 | | | | | | | | | | |--- lead_time <= 44.00 | | | | | | | | | | | |--- weights: [0.75, 59.21] class: 1.0 | | | | | | | | | | |--- lead_time > 44.00 | | | | | | | | | | | |--- weights: [3.73, 0.00] class: 0.0 | | | | | | | |--- avg_price_per_room > 63.29 | | | | | | | | |--- no_of_weekend_nights <= 3.50 | | | | | | | | | |--- lead_time <= 59.50 | | | | | | | | | | |--- arrival_month <= 7.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- arrival_month > 7.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | |--- lead_time > 59.50 | | | | | | | | | | |--- arrival_month <= 5.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- arrival_month > 5.50 | | | | | | | | | | | |--- weights: [20.13, 0.00] class: 0.0 | | | | | | | | |--- no_of_weekend_nights > 3.50 | | | | | | | | | |--- weights: [0.75, 15.18] class: 1.0 | | | | | | |--- arrival_month > 9.50 | | | | | | | |--- market_segment_type_Aviation <= 0.50 | | | | | | | | |--- weights: [401.85, 21.25] class: 0.0 | | | | | | | |--- market_segment_type_Aviation > 0.50 | | | | | | | | |--- arrival_date <= 7.50 | | | | | | | | | |--- weights: [0.75, 4.55] class: 1.0 | | | | | | | | |--- arrival_date > 7.50 | | | | | | | | | |--- weights: [10.44, 1.52] class: 0.0 | | | | | |--- lead_time > 68.50 | | | | | | |--- avg_price_per_room <= 99.98 | | | | | | | |--- arrival_month <= 3.50 | | | | | | | | |--- avg_price_per_room <= 62.50 | | | | | | | | | |--- weights: [15.66, 0.00] class: 0.0 | | | | | | | | |--- avg_price_per_room > 62.50 | | | | | | | | | |--- 
avg_price_per_room <= 80.38 | | | | | | | | | | |--- weights: [8.20, 25.81] class: 1.0 | | | | | | | | | |--- avg_price_per_room > 80.38 | | | | | | | | | | |--- weights: [3.73, 0.00] class: 0.0 | | | | | | | |--- arrival_month > 3.50 | | | | | | | | |--- no_of_week_nights <= 2.50 | | | | | | | | | |--- weights: [55.17, 3.04] class: 0.0 | | | | | | | | |--- no_of_week_nights > 2.50 | | | | | | | | | |--- lead_time <= 73.50 | | | | | | | | | | |--- weights: [0.00, 4.55] class: 1.0 | | | | | | | | | |--- lead_time > 73.50 | | | | | | | | | | |--- weights: [21.62, 4.55] class: 0.0 | | | | | | |--- avg_price_per_room > 99.98 | | | | | | | |--- arrival_year <= 2017.50 | | | | | | | | |--- weights: [8.95, 0.00] class: 0.0 | | | | | | | |--- arrival_year > 2017.50 | | | | | | | | |--- avg_price_per_room <= 132.43 | | | | | | | | | |--- weights: [9.69, 122.97] class: 1.0 | | | | | | | | |--- avg_price_per_room > 132.43 | | | | | | | | | |--- weights: [6.71, 0.00] class: 0.0 | | | |--- lead_time > 90.50 | | | | |--- lead_time <= 117.50 | | | | | |--- avg_price_per_room <= 93.58 | | | | | | |--- avg_price_per_room <= 75.07 | | | | | | | |--- no_of_week_nights <= 2.50 | | | | | | | | |--- avg_price_per_room <= 58.75 | | | | | | | | | |--- weights: [5.96, 0.00] class: 0.0 | | | | | | | | |--- avg_price_per_room > 58.75 | | | | | | | | | |--- no_of_previous_cancellations <= 0.50 | | | | | | | | | | |--- arrival_month <= 4.50 | | | | | | | | | | | |--- weights: [2.24, 118.41] class: 1.0 | | | | | | | | | | |--- arrival_month > 4.50 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | |--- no_of_previous_cancellations > 0.50 | | | | | | | | | | |--- weights: [4.47, 0.00] class: 0.0 | | | | | | | |--- no_of_week_nights > 2.50 | | | | | | | | |--- arrival_date <= 11.50 | | | | | | | | | |--- weights: [31.31, 0.00] class: 0.0 | | | | | | | | |--- arrival_date > 11.50 | | | | | | | | | |--- weights: [29.08, 15.18] class: 0.0 | | | | | | |--- avg_price_per_room > 
75.07 | | | | | | | |--- arrival_month <= 3.50 | | | | | | | | |--- weights: [59.64, 3.04] class: 0.0 | | | | | | | |--- arrival_month > 3.50 | | | | | | | | |--- arrival_month <= 4.50 | | | | | | | | | |--- weights: [1.49, 16.70] class: 1.0 | | | | | | | | |--- arrival_month > 4.50 | | | | | | | | | |--- no_of_adults <= 1.50 | | | | | | | | | | |--- avg_price_per_room <= 86.00 | | | | | | | | | | | |--- weights: [2.24, 16.70] class: 1.0 | | | | | | | | | | |--- avg_price_per_room > 86.00 | | | | | | | | | | | |--- weights: [8.95, 3.04] class: 0.0 | | | | | | | | | |--- no_of_adults > 1.50 | | | | | | | | | | |--- arrival_date <= 22.50 | | | | | | | | | | | |--- weights: [44.73, 4.55] class: 0.0 | | | | | | | | | | |--- arrival_date > 22.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | |--- avg_price_per_room > 93.58 | | | | | | |--- arrival_date <= 11.50 | | | | | | | |--- no_of_week_nights <= 1.50 | | | | | | | | |--- weights: [16.40, 39.47] class: 1.0 | | | | | | | |--- no_of_week_nights > 1.50 | | | | | | | | |--- weights: [20.13, 6.07] class: 0.0 | | | | | | |--- arrival_date > 11.50 | | | | | | | |--- avg_price_per_room <= 102.09 | | | | | | | | |--- weights: [5.22, 144.22] class: 1.0 | | | | | | | |--- avg_price_per_room > 102.09 | | | | | | | | |--- avg_price_per_room <= 109.50 | | | | | | | | | |--- no_of_week_nights <= 1.50 | | | | | | | | | | |--- weights: [0.75, 16.70] class: 1.0 | | | | | | | | | |--- no_of_week_nights > 1.50 | | | | | | | | | | |--- weights: [33.55, 0.00] class: 0.0 | | | | | | | | |--- avg_price_per_room > 109.50 | | | | | | | | | |--- avg_price_per_room <= 124.25 | | | | | | | | | | |--- weights: [2.98, 75.91] class: 1.0 | | | | | | | | | |--- avg_price_per_room > 124.25 | | | | | | | | | | |--- weights: [3.73, 3.04] class: 0.0 | | | | |--- lead_time > 117.50 | | | | | |--- no_of_week_nights <= 1.50 | | | | | | |--- arrival_date <= 7.50 | | | | | | | |--- weights: [38.02, 0.00] class: 0.0 | | | | | | |--- 
arrival_date > 7.50 | | | | | | | |--- avg_price_per_room <= 93.58 | | | | | | | | |--- avg_price_per_room <= 65.38 | | | | | | | | | |--- weights: [0.00, 4.55] class: 1.0 | | | | | | | | |--- avg_price_per_room > 65.38 | | | | | | | | | |--- weights: [24.60, 3.04] class: 0.0 | | | | | | | |--- avg_price_per_room > 93.58 | | | | | | | | |--- arrival_date <= 28.00 | | | | | | | | | |--- weights: [14.91, 72.87] class: 1.0 | | | | | | | | |--- arrival_date > 28.00 | | | | | | | | | |--- weights: [9.69, 1.52] class: 0.0 | | | | | |--- no_of_week_nights > 1.50 | | | | | | |--- no_of_adults <= 1.50 | | | | | | | |--- weights: [84.25, 0.00] class: 0.0 | | | | | | |--- no_of_adults > 1.50 | | | | | | | |--- lead_time <= 125.50 | | | | | | | | |--- avg_price_per_room <= 90.85 | | | | | | | | | |--- avg_price_per_room <= 87.50 | | | | | | | | | | |--- weights: [13.42, 13.66] class: 1.0 | | | | | | | | | |--- avg_price_per_room > 87.50 | | | | | | | | | | |--- weights: [0.00, 15.18] class: 1.0 | | | | | | | | |--- avg_price_per_room > 90.85 | | | | | | | | | |--- weights: [10.44, 0.00] class: 0.0 | | | | | | | |--- lead_time > 125.50 | | | | | | | | |--- arrival_date <= 19.50 | | | | | | | | | |--- weights: [58.15, 18.22] class: 0.0 | | | | | | | | |--- arrival_date > 19.50 | | | | | | | | | |--- weights: [61.88, 1.52] class: 0.0 | | |--- market_segment_type_Online > 0.50 | | | |--- lead_time <= 13.50 | | | | |--- avg_price_per_room <= 99.44 | | | | | |--- arrival_month <= 1.50 | | | | | | |--- weights: [92.45, 0.00] class: 0.0 | | | | | |--- arrival_month > 1.50 | | | | | | |--- arrival_month <= 8.50 | | | | | | | |--- no_of_weekend_nights <= 1.50 | | | | | | | | |--- avg_price_per_room <= 70.05 | | | | | | | | | |--- weights: [31.31, 0.00] class: 0.0 | | | | | | | | |--- avg_price_per_room > 70.05 | | | | | | | | | |--- lead_time <= 5.50 | | | | | | | | | | |--- no_of_adults <= 1.50 | | | | | | | | | | | |--- weights: [38.77, 1.52] class: 0.0 | | | | | | | | | | |--- 
no_of_adults > 1.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | |--- lead_time > 5.50 | | | | | | | | | | |--- arrival_date <= 3.50 | | | | | | | | | | | |--- weights: [6.71, 0.00] class: 0.0 | | | | | | | | | | |--- arrival_date > 3.50 | | | | | | | | | | | |--- weights: [34.30, 40.99] class: 1.0 | | | | | | | |--- no_of_weekend_nights > 1.50 | | | | | | | | |--- no_of_adults <= 1.50 | | | | | | | | | |--- weights: [0.00, 19.74] class: 1.0 | | | | | | | | |--- no_of_adults > 1.50 | | | | | | | | | |--- lead_time <= 2.50 | | | | | | | | | | |--- avg_price_per_room <= 74.21 | | | | | | | | | | | |--- weights: [0.75, 3.04] class: 1.0 | | | | | | | | | | |--- avg_price_per_room > 74.21 | | | | | | | | | | | |--- weights: [9.69, 0.00] class: 0.0 | | | | | | | | | |--- lead_time > 2.50 | | | | | | | | | | |--- weights: [4.47, 10.63] class: 1.0 | | | | | | |--- arrival_month > 8.50 | | | | | | | |--- no_of_week_nights <= 3.50 | | | | | | | | |--- weights: [155.07, 6.07] class: 0.0 | | | | | | | |--- no_of_week_nights > 3.50 | | | | | | | | |--- arrival_month <= 11.50 | | | | | | | | | |--- weights: [3.73, 10.63] class: 1.0 | | | | | | | | |--- arrival_month > 11.50 | | | | | | | | | |--- weights: [7.46, 0.00] class: 0.0 | | | | |--- avg_price_per_room > 99.44 | | | | | |--- lead_time <= 3.50 | | | | | | |--- avg_price_per_room <= 178.78 | | | | | | | |--- no_of_week_nights <= 4.50 | | | | | | | | |--- arrival_month <= 5.50 | | | | | | | | | |--- weights: [58.15, 30.36] class: 0.0 | | | | | | | | |--- arrival_month > 5.50 | | | | | | | | | |--- weights: [145.38, 22.77] class: 0.0 | | | | | | | |--- no_of_week_nights > 4.50 | | | | | | | | |--- weights: [0.00, 6.07] class: 1.0 | | | | | | |--- avg_price_per_room > 178.78 | | | | | | | |--- weights: [16.40, 25.81] class: 1.0 | | | | | |--- lead_time > 3.50 | | | | | | |--- arrival_month <= 8.50 | | | | | | | |--- avg_price_per_room <= 119.25 | | | | | | | | |--- avg_price_per_room <= 118.50 | | 
| | | | | | | |--- weights: [18.64, 59.21] class: 1.0 | | | | | | | | |--- avg_price_per_room > 118.50 | | | | | | | | | |--- weights: [8.20, 1.52] class: 0.0 | | | | | | | |--- avg_price_per_room > 119.25 | | | | | | | | |--- weights: [34.30, 171.55] class: 1.0 | | | | | | |--- arrival_month > 8.50 | | | | | | | |--- arrival_year <= 2017.50 | | | | | | | | |--- weights: [26.09, 1.52] class: 0.0 | | | | | | | |--- arrival_year > 2017.50 | | | | | | | | |--- arrival_month <= 11.50 | | | | | | | | | |--- arrival_date <= 14.00 | | | | | | | | | | |--- weights: [9.69, 36.43] class: 1.0 | | | | | | | | | |--- arrival_date > 14.00 | | | | | | | | | | |--- no_of_weekend_nights <= 0.50 | | | | | | | | | | | |--- weights: [11.18, 0.00] class: 0.0 | | | | | | | | | | |--- no_of_weekend_nights > 0.50 | | | | | | | | | | | |--- weights: [8.95, 10.63] class: 1.0 | | | | | | | | |--- arrival_month > 11.50 | | | | | | | | | |--- weights: [15.66, 0.00] class: 0.0 | | | |--- lead_time > 13.50 | | | | |--- required_car_parking_space <= 0.50 | | | | | |--- avg_price_per_room <= 71.92 | | | | | | |--- avg_price_per_room <= 59.43 | | | | | | | |--- lead_time <= 84.50 | | | | | | | | |--- weights: [50.70, 7.59] class: 0.0 | | | | | | | |--- lead_time > 84.50 | | | | | | | | |--- arrival_year <= 2017.50 | | | | | | | | | |--- arrival_date <= 27.00 | | | | | | | | | | |--- lead_time <= 131.50 | | | | | | | | | | | |--- weights: [0.75, 15.18] class: 1.0 | | | | | | | | | | |--- lead_time > 131.50 | | | | | | | | | | | |--- weights: [2.24, 0.00] class: 0.0 | | | | | | | | | |--- arrival_date > 27.00 | | | | | | | | | | |--- weights: [3.73, 0.00] class: 0.0 | | | | | | | | |--- arrival_year > 2017.50 | | | | | | | | | |--- weights: [10.44, 0.00] class: 0.0 | | | | | | |--- avg_price_per_room > 59.43 | | | | | | | |--- lead_time <= 25.50 | | | | | | | | |--- weights: [20.88, 6.07] class: 0.0 | | | | | | | |--- lead_time > 25.50 | | | | | | | | |--- avg_price_per_room <= 71.34 | | | | | | | | 
| |--- arrival_month <= 3.50 | | | | | | | | | | |--- lead_time <= 68.50 | | | | | | | | | | | |--- weights: [15.66, 78.94] class: 1.0 | | | | | | | | | | |--- lead_time > 68.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | |--- arrival_month > 3.50 | | | | | | | | | | |--- lead_time <= 102.00 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- lead_time > 102.00 | | | | | | | | | | | |--- weights: [12.67, 3.04] class: 0.0 | | | | | | | | |--- avg_price_per_room > 71.34 | | | | | | | | | |--- weights: [11.18, 0.00] class: 0.0 | | | | | |--- avg_price_per_room > 71.92 | | | | | | |--- arrival_year <= 2017.50 | | | | | | | |--- lead_time <= 65.50 | | | | | | | | |--- avg_price_per_room <= 120.45 | | | | | | | | | |--- weights: [79.77, 9.11] class: 0.0 | | | | | | | | |--- avg_price_per_room > 120.45 | | | | | | | | | |--- no_of_week_nights <= 1.50 | | | | | | | | | | |--- weights: [3.73, 0.00] class: 0.0 | | | | | | | | | |--- no_of_week_nights > 1.50 | | | | | | | | | | |--- weights: [3.73, 12.14] class: 1.0 | | | | | | | |--- lead_time > 65.50 | | | | | | | | |--- type_of_meal_plan <= 1.50 | | | | | | | | | |--- arrival_date <= 27.50 | | | | | | | | | | |--- weights: [16.40, 47.06] class: 1.0 | | | | | | | | | |--- arrival_date > 27.50 | | | | | | | | | | |--- weights: [3.73, 0.00] class: 0.0 | | | | | | | | |--- type_of_meal_plan > 1.50 | | | | | | | | | |--- weights: [0.00, 63.76] class: 1.0 | | | | | | |--- arrival_year > 2017.50 | | | | | | | |--- avg_price_per_room <= 104.31 | | | | | | | | |--- lead_time <= 25.50 | | | | | | | | | |--- arrival_month <= 11.50 | | | | | | | | | | |--- arrival_month <= 1.50 | | | | | | | | | | | |--- weights: [16.40, 0.00] class: 0.0 | | | | | | | | | | |--- arrival_month > 1.50 | | | | | | | | | | | |--- weights: [38.77, 118.41] class: 1.0 | | | | | | | | | |--- arrival_month > 11.50 | | | | | | | | | | |--- weights: [23.11, 0.00] class: 0.0 | | | | | | | | |--- lead_time > 
25.50 | | | | | | | | | |--- type_of_meal_plan <= 0.00 | | | | | | | | | | |--- weights: [73.81, 411.41] class: 1.0 | | | | | | | | | |--- type_of_meal_plan > 0.00 | | | | | | | | | | |--- no_of_week_nights <= 1.50 | | | | | | | | | | | |--- weights: [39.51, 185.21] class: 1.0 | | | | | | | | | | |--- no_of_week_nights > 1.50 | | | | | | | | | | | |--- truncated branch of depth 6 | | | | | | | |--- avg_price_per_room > 104.31 | | | | | | | | |--- arrival_month <= 10.50 | | | | | | | | | |--- room_type_reserved_Room_Type 5 <= 0.50 | | | | | | | | | | |--- avg_price_per_room <= 144.76 | | | | | | | | | | | |--- truncated branch of depth 10 | | | | | | | | | | |--- avg_price_per_room > 144.76 | | | | | | | | | | | |--- weights: [71.57, 669.49] class: 1.0 | | | | | | | | | |--- room_type_reserved_Room_Type 5 > 0.50 | | | | | | | | | | |--- arrival_date <= 22.50 | | | | | | | | | | | |--- weights: [11.18, 6.07] class: 0.0 | | | | | | | | | | |--- arrival_date > 22.50 | | | | | | | | | | | |--- weights: [0.75, 9.11] class: 1.0 | | | | | | | | |--- arrival_month > 10.50 | | | | | | | | | |--- avg_price_per_room <= 168.06 | | | | | | | | | | |--- lead_time <= 22.00 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- lead_time > 22.00 | | | | | | | | | | | |--- weights: [17.15, 83.50] class: 1.0 | | | | | | | | | |--- avg_price_per_room > 168.06 | | | | | | | | | | |--- weights: [12.67, 6.07] class: 0.0 | | | | |--- required_car_parking_space > 0.50 | | | | | |--- weights: [48.46, 1.52] class: 0.0 | |--- no_of_special_requests > 0.50 | | |--- no_of_special_requests <= 1.50 | | | |--- market_segment_type_Online <= 0.50 | | | | |--- lead_time <= 102.50 | | | | | |--- type_of_meal_plan <= 0.00 | | | | | | |--- lead_time <= 63.00 | | | | | | | |--- weights: [15.66, 1.52] class: 0.0 | | | | | | |--- lead_time > 63.00 | | | | | | | |--- weights: [0.00, 7.59] class: 1.0 | | | | | |--- type_of_meal_plan > 0.00 | | | | | | |--- weights: [697.09, 9.11] 
class: 0.0 | | | | |--- lead_time > 102.50 | | | | | |--- no_of_week_nights <= 2.50 | | | | | | |--- lead_time <= 105.00 | | | | | | | |--- weights: [0.75, 6.07] class: 1.0 | | | | | | |--- lead_time > 105.00 | | | | | | | |--- weights: [31.31, 13.66] class: 0.0 | | | | | |--- no_of_week_nights > 2.50 | | | | | | |--- weights: [44.73, 3.04] class: 0.0 | | | |--- market_segment_type_Online > 0.50 | | | | |--- lead_time <= 8.50 | | | | | |--- lead_time <= 4.50 | | | | | | |--- no_of_weekend_nights <= 3.50 | | | | | | | |--- weights: [497.28, 40.99] class: 0.0 | | | | | | |--- no_of_weekend_nights > 3.50 | | | | | | | |--- weights: [0.75, 3.04] class: 1.0 | | | | | |--- lead_time > 4.50 | | | | | | |--- arrival_date <= 13.50 | | | | | | | |--- arrival_month <= 9.50 | | | | | | | | |--- weights: [58.90, 36.43] class: 0.0 | | | | | | | |--- arrival_month > 9.50 | | | | | | | | |--- weights: [33.55, 1.52] class: 0.0 | | | | | | |--- arrival_date > 13.50 | | | | | | | |--- type_of_meal_plan <= 0.00 | | | | | | | | |--- avg_price_per_room <= 126.33 | | | | | | | | | |--- weights: [32.80, 3.04] class: 0.0 | | | | | | | | |--- avg_price_per_room > 126.33 | | | | | | | | | |--- weights: [9.69, 13.66] class: 1.0 | | | | | | | |--- type_of_meal_plan > 0.00 | | | | | | | | |--- weights: [123.76, 9.11] class: 0.0 | | | | |--- lead_time > 8.50 | | | | | |--- required_car_parking_space <= 0.50 | | | | | | |--- avg_price_per_room <= 118.55 | | | | | | | |--- lead_time <= 61.50 | | | | | | | | |--- arrival_month <= 11.50 | | | | | | | | | |--- arrival_month <= 1.50 | | | | | | | | | | |--- weights: [70.08, 0.00] class: 0.0 | | | | | | | | | |--- arrival_month > 1.50 | | | | | | | | | | |--- no_of_week_nights <= 4.50 | | | | | | | | | | | |--- truncated branch of depth 11 | | | | | | | | | | |--- no_of_week_nights > 4.50 | | | | | | | | | | | |--- truncated branch of depth 6 | | | | | | | | |--- arrival_month > 11.50 | | | | | | | | | |--- weights: [126.74, 1.52] class: 0.0 | | | | | 
| | |--- lead_time > 61.50 | | | | | | | | |--- arrival_year <= 2017.50 | | | | | | | | | |--- arrival_month <= 7.50 | | | | | | | | | | |--- weights: [4.47, 57.69] class: 1.0 | | | | | | | | | |--- arrival_month > 7.50 | | | | | | | | | | |--- lead_time <= 66.50 | | | | | | | | | | | |--- weights: [5.22, 0.00] class: 0.0 | | | | | | | | | | |--- lead_time > 66.50 | | | | | | | | | | | |--- weights: [37.28, 54.65] class: 1.0 | | | | | | | | |--- arrival_year > 2017.50 | | | | | | | | | |--- arrival_month <= 9.50 | | | | | | | | | | |--- avg_price_per_room <= 71.93 | | | | | | | | | | | |--- weights: [54.43, 3.04] class: 0.0 | | | | | | | | | | |--- avg_price_per_room > 71.93 | | | | | | | | | | | |--- truncated branch of depth 10 | | | | | | | | | |--- arrival_month > 9.50 | | | | | | | | | | |--- no_of_week_nights <= 1.50 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | | |--- no_of_week_nights > 1.50 | | | | | | | | | | | |--- truncated branch of depth 6 | | | | | | |--- avg_price_per_room > 118.55 | | | | | | | |--- arrival_month <= 8.50 | | | | | | | | |--- arrival_date <= 19.50 | | | | | | | | | |--- avg_price_per_room <= 177.15 | | | | | | | | | | |--- avg_price_per_room <= 118.98 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- avg_price_per_room > 118.98 | | | | | | | | | | | |--- truncated branch of depth 7 | | | | | | | | | |--- avg_price_per_room > 177.15 | | | | | | | | | | |--- arrival_date <= 7.00 | | | | | | | | | | | |--- weights: [6.71, 0.00] class: 0.0 | | | | | | | | | | |--- arrival_date > 7.00 | | | | | | | | | | | |--- weights: [12.67, 24.29] class: 1.0 | | | | | | | | |--- arrival_date > 19.50 | | | | | | | | | |--- arrival_date <= 27.50 | | | | | | | | | | |--- avg_price_per_room <= 121.20 | | | | | | | | | | | |--- weights: [18.64, 6.07] class: 0.0 | | | | | | | | | | |--- avg_price_per_room > 121.20 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | |--- 
arrival_date > 27.50 | | | | | | | | | | |--- lead_time <= 55.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- lead_time > 55.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | |--- arrival_month > 8.50 | | | | | | | | |--- arrival_year <= 2017.50 | | | | | | | | | |--- arrival_month <= 9.50 | | | | | | | | | | |--- weights: [11.93, 10.63] class: 0.0 | | | | | | | | | |--- arrival_month > 9.50 | | | | | | | | | | |--- weights: [37.28, 0.00] class: 0.0 | | | | | | | | |--- arrival_year > 2017.50 | | | | | | | | | |--- arrival_month <= 11.50 | | | | | | | | | | |--- avg_price_per_room <= 119.20 | | | | | | | | | | | |--- weights: [9.69, 28.84] class: 1.0 | | | | | | | | | | |--- avg_price_per_room > 119.20 | | | | | | | | | | | |--- truncated branch of depth 12 | | | | | | | | | |--- arrival_month > 11.50 | | | | | | | | | | |--- lead_time <= 100.00 | | | | | | | | | | | |--- weights: [49.95, 0.00] class: 0.0 | | | | | | | | | | |--- lead_time > 100.00 | | | | | | | | | | | |--- weights: [0.75, 18.22] class: 1.0 | | | | | |--- required_car_parking_space > 0.50 | | | | | | |--- weights: [134.20, 1.52] class: 0.0 | | |--- no_of_special_requests > 1.50 | | | |--- lead_time <= 90.50 | | | | |--- no_of_week_nights <= 3.50 | | | | | |--- weights: [1585.04, 0.00] class: 0.0 | | | | |--- no_of_week_nights > 3.50 | | | | | |--- no_of_special_requests <= 2.50 | | | | | | |--- lead_time <= 6.50 | | | | | | | |--- weights: [32.06, 1.52] class: 0.0 | | | | | | |--- lead_time > 6.50 | | | | | | | |--- room_type_reserved_Room_Type 4 <= 0.50 | | | | | | | | |--- weights: [103.63, 50.10] class: 0.0 | | | | | | | |--- room_type_reserved_Room_Type 4 > 0.50 | | | | | | | | |--- weights: [44.73, 6.07] class: 0.0 | | | | | |--- no_of_special_requests > 2.50 | | | | | | |--- weights: [52.19, 0.00] class: 0.0 | | | |--- lead_time > 90.50 | | | | |--- no_of_special_requests <= 2.50 | | | | | |--- arrival_month <= 8.50 | | | | | | |--- 
lead_time <= 150.50 | | | | | | | |--- arrival_year <= 2017.50 | | | | | | | | |--- weights: [8.20, 12.14] class: 1.0 | | | | | | | |--- arrival_year > 2017.50 | | | | | | | | |--- no_of_children <= 0.50 | | | | | | | | | |--- avg_price_per_room <= 157.50 | | | | | | | | | | |--- weights: [140.91, 13.66] class: 0.0 | | | | | | | | | |--- avg_price_per_room > 157.50 | | | | | | | | | | |--- arrival_date <= 12.50 | | | | | | | | | | | |--- weights: [1.49, 6.07] class: 1.0 | | | | | | | | | | |--- arrival_date > 12.50 | | | | | | | | | | | |--- weights: [5.22, 0.00] class: 0.0 | | | | | | | | |--- no_of_children > 0.50 | | | | | | | | | |--- weights: [27.59, 16.70] class: 0.0 | | | | | | |--- lead_time > 150.50 | | | | | | | |--- weights: [1.49, 7.59] class: 1.0 | | | | | |--- arrival_month > 8.50 | | | | | | |--- avg_price_per_room <= 153.15 | | | | | | | |--- room_type_reserved_Room_Type 2 <= 0.50 | | | | | | | | |--- avg_price_per_room <= 71.12 | | | | | | | | | |--- weights: [3.73, 0.00] class: 0.0 | | | | | | | | |--- avg_price_per_room > 71.12 | | | | | | | | | |--- avg_price_per_room <= 90.42 | | | | | | | | | | |--- arrival_month <= 11.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- arrival_month > 11.50 | | | | | | | | | | | |--- weights: [12.67, 7.59] class: 0.0 | | | | | | | | | |--- avg_price_per_room > 90.42 | | | | | | | | | | |--- weights: [64.12, 60.72] class: 0.0 | | | | | | | |--- room_type_reserved_Room_Type 2 > 0.50 | | | | | | | | |--- weights: [5.96, 0.00] class: 0.0 | | | | | | |--- avg_price_per_room > 153.15 | | | | | | | |--- weights: [12.67, 3.04] class: 0.0 | | | | |--- no_of_special_requests > 2.50 | | | | | |--- weights: [67.10, 0.00] class: 0.0 |--- lead_time > 151.50 | |--- avg_price_per_room <= 100.04 | | |--- no_of_special_requests <= 0.50 | | | |--- no_of_adults <= 1.50 | | | | |--- market_segment_type_Online <= 0.50 | | | | | |--- lead_time <= 163.50 | | | | | | |--- lead_time <= 160.50 | | | | | | 
| |--- weights: [2.98, 0.00] class: 0.0 | | | | | | |--- lead_time > 160.50 | | | | | | | |--- weights: [0.75, 24.29] class: 1.0 | | | | | |--- lead_time > 163.50 | | | | | | |--- no_of_week_nights <= 4.50 | | | | | | | |--- lead_time <= 173.00 | | | | | | | | |--- arrival_date <= 3.50 | | | | | | | | | |--- weights: [46.97, 9.11] class: 0.0 | | | | | | | | |--- arrival_date > 3.50 | | | | | | | | | |--- no_of_weekend_nights <= 1.00 | | | | | | | | | | |--- weights: [0.00, 13.66] class: 1.0 | | | | | | | | | |--- no_of_weekend_nights > 1.00 | | | | | | | | | | |--- weights: [2.24, 0.00] class: 0.0 | | | | | | | |--- lead_time > 173.00 | | | | | | | | |--- arrival_month <= 5.50 | | | | | | | | | |--- avg_price_per_room <= 88.00 | | | | | | | | | | |--- no_of_week_nights <= 1.50 | | | | | | | | | | | |--- weights: [0.00, 3.04] class: 1.0 | | | | | | | | | | |--- no_of_week_nights > 1.50 | | | | | | | | | | | |--- weights: [6.71, 0.00] class: 0.0 | | | | | | | | | |--- avg_price_per_room > 88.00 | | | | | | | | | | |--- weights: [0.00, 4.55] class: 1.0 | | | | | | | | |--- arrival_month > 5.50 | | | | | | | | | |--- lead_time <= 278.00 | | | | | | | | | | |--- weights: [143.15, 4.55] class: 0.0 | | | | | | | | | |--- lead_time > 278.00 | | | | | | | | | | |--- avg_price_per_room <= 83.50 | | | | | | | | | | | |--- weights: [20.13, 0.00] class: 0.0 | | | | | | | | | | |--- avg_price_per_room > 83.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | |--- no_of_week_nights > 4.50 | | | | | | | |--- weights: [2.24, 12.14] class: 1.0 | | | | |--- market_segment_type_Online > 0.50 | | | | | |--- avg_price_per_room <= 35.22 | | | | | | |--- lead_time <= 285.50 | | | | | | | |--- weights: [8.20, 0.00] class: 0.0 | | | | | | |--- lead_time > 285.50 | | | | | | | |--- weights: [0.75, 4.55] class: 1.0 | | | | | |--- avg_price_per_room > 35.22 | | | | | | |--- weights: [0.75, 95.64] class: 1.0 | | | |--- no_of_adults > 1.50 | | | | |--- avg_price_per_room <= 
82.47 | | | | | |--- market_segment_type_Offline <= 0.50 | | | | | | |--- weights: [2.98, 282.37] class: 1.0 | | | | | |--- market_segment_type_Offline > 0.50 | | | | | | |--- arrival_month <= 11.50 | | | | | | | |--- lead_time <= 244.00 | | | | | | | | |--- no_of_week_nights <= 1.50 | | | | | | | | | |--- no_of_weekend_nights <= 1.50 | | | | | | | | | | |--- lead_time <= 166.50 | | | | | | | | | | | |--- weights: [2.24, 0.00] class: 0.0 | | | | | | | | | | |--- lead_time > 166.50 | | | | | | | | | | | |--- weights: [2.24, 57.69] class: 1.0 | | | | | | | | | |--- no_of_weekend_nights > 1.50 | | | | | | | | | | |--- weights: [17.89, 0.00] class: 0.0 | | | | | | | | |--- no_of_week_nights > 1.50 | | | | | | | | | |--- no_of_weekend_nights <= 0.50 | | | | | | | | | | |--- arrival_month <= 9.50 | | | | | | | | | | | |--- weights: [11.18, 3.04] class: 0.0 | | | | | | | | | | |--- arrival_month > 9.50 | | | | | | | | | | | |--- weights: [0.00, 12.14] class: 1.0 | | | | | | | | | |--- no_of_weekend_nights > 0.50 | | | | | | | | | | |--- weights: [75.30, 12.14] class: 0.0 | | | | | | | |--- lead_time > 244.00 | | | | | | | | |--- arrival_year <= 2017.50 | | | | | | | | | |--- weights: [25.35, 0.00] class: 0.0 | | | | | | | | |--- arrival_year > 2017.50 | | | | | | | | | |--- avg_price_per_room <= 80.38 | | | | | | | | | | |--- no_of_week_nights <= 3.50 | | | | | | | | | | | |--- weights: [11.18, 264.15] class: 1.0 | | | | | | | | | | |--- no_of_week_nights > 3.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | |--- avg_price_per_room > 80.38 | | | | | | | | | | |--- weights: [7.46, 0.00] class: 0.0 | | | | | | |--- arrival_month > 11.50 | | | | | | | |--- weights: [46.22, 0.00] class: 0.0 | | | | |--- avg_price_per_room > 82.47 | | | | | |--- no_of_adults <= 2.50 | | | | | | |--- type_of_meal_plan <= 1.50 | | | | | | | |--- arrival_month <= 11.50 | | | | | | | | |--- room_type_reserved_Room_Type 4 <= 0.50 | | | | | | | | | |--- weights: [8.95, 
982.22] class: 1.0 | | | | | | | | |--- room_type_reserved_Room_Type 4 > 0.50 | | | | | | | | | |--- market_segment_type_Offline <= 0.50 | | | | | | | | | | |--- weights: [0.00, 10.63] class: 1.0 | | | | | | | | | |--- market_segment_type_Offline > 0.50 | | | | | | | | | | |--- weights: [4.47, 0.00] class: 0.0 | | | | | | | |--- arrival_month > 11.50 | | | | | | | | |--- market_segment_type_Online <= 0.50 | | | | | | | | | |--- weights: [5.22, 0.00] class: 0.0 | | | | | | | | |--- market_segment_type_Online > 0.50 | | | | | | | | | |--- weights: [0.00, 21.25] class: 1.0 | | | | | | |--- type_of_meal_plan > 1.50 | | | | | | | |--- lead_time <= 249.75 | | | | | | | | |--- weights: [0.00, 16.70] class: 1.0 | | | | | | | |--- lead_time > 249.75 | | | | | | | | |--- weights: [5.22, 0.00] class: 0.0 | | | | | |--- no_of_adults > 2.50 | | | | | | |--- weights: [5.22, 0.00] class: 0.0 | | |--- no_of_special_requests > 0.50 | | | |--- no_of_weekend_nights <= 0.50 | | | | |--- lead_time <= 180.50 | | | | | |--- lead_time <= 159.50 | | | | | | |--- arrival_month <= 8.50 | | | | | | | |--- weights: [5.96, 0.00] class: 0.0 | | | | | | |--- arrival_month > 8.50 | | | | | | | |--- weights: [1.49, 7.59] class: 1.0 | | | | | |--- lead_time > 159.50 | | | | | | |--- arrival_date <= 1.50 | | | | | | | |--- weights: [1.49, 3.04] class: 1.0 | | | | | | |--- arrival_date > 1.50 | | | | | | | |--- weights: [35.79, 1.52] class: 0.0 | | | | |--- lead_time > 180.50 | | | | | |--- no_of_special_requests <= 2.50 | | | | | | |--- market_segment_type_Online <= 0.50 | | | | | | | |--- no_of_adults <= 2.50 | | | | | | | | |--- weights: [12.67, 3.04] class: 0.0 | | | | | | | |--- no_of_adults > 2.50 | | | | | | | | |--- weights: [0.00, 3.04] class: 1.0 | | | | | | |--- market_segment_type_Online > 0.50 | | | | | | | |--- weights: [7.46, 206.46] class: 1.0 | | | | | |--- no_of_special_requests > 2.50 | | | | | | |--- weights: [8.95, 0.00] class: 0.0 | | | |--- no_of_weekend_nights > 0.50 | | | | 
|--- market_segment_type_Offline <= 0.50 | | | | | |--- arrival_month <= 11.50 | | | | | | |--- avg_price_per_room <= 76.48 | | | | | | | |--- weights: [46.97, 4.55] class: 0.0 | | | | | | |--- avg_price_per_room > 76.48 | | | | | | | |--- arrival_date <= 27.50 | | | | | | | | |--- no_of_week_nights <= 5.50 | | | | | | | | | |--- lead_time <= 233.00 | | | | | | | | | | |--- lead_time <= 152.50 | | | | | | | | | | | |--- weights: [1.49, 4.55] class: 1.0 | | | | | | | | | | |--- lead_time > 152.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | |--- lead_time > 233.00 | | | | | | | | | | |--- weights: [23.11, 19.74] class: 0.0 | | | | | | | | |--- no_of_week_nights > 5.50 | | | | | | | | | |--- weights: [8.95, 16.70] class: 1.0 | | | | | | | |--- arrival_date > 27.50 | | | | | | | | |--- no_of_week_nights <= 1.50 | | | | | | | | | |--- weights: [2.24, 15.18] class: 1.0 | | | | | | | | |--- no_of_week_nights > 1.50 | | | | | | | | | |--- lead_time <= 269.00 | | | | | | | | | | |--- lead_time <= 176.00 | | | | | | | | | | | |--- weights: [2.24, 7.59] class: 1.0 | | | | | | | | | | |--- lead_time > 176.00 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | |--- lead_time > 269.00 | | | | | | | | | | |--- weights: [0.00, 4.55] class: 1.0 | | | | | |--- arrival_month > 11.50 | | | | | | |--- arrival_date <= 14.50 | | | | | | | |--- weights: [8.20, 3.04] class: 0.0 | | | | | | |--- arrival_date > 14.50 | | | | | | | |--- weights: [11.18, 31.88] class: 1.0 | | | | |--- market_segment_type_Offline > 0.50 | | | | | |--- weights: [112.58, 7.59] class: 0.0 | |--- avg_price_per_room > 100.04 | | |--- arrival_month <= 11.50 | | | |--- no_of_special_requests <= 2.50 | | | | |--- weights: [0.00, 3200.19] class: 1.0 | | | |--- no_of_special_requests > 2.50 | | | | |--- weights: [23.11, 0.00] class: 0.0 | | |--- arrival_month > 11.50 | | | |--- no_of_special_requests <= 0.50 | | | | |--- weights: [35.04, 0.00] class: 0.0 | | | |--- 
no_of_special_requests > 0.50 | | | | |--- arrival_date <= 24.50 | | | | | |--- weights: [3.73, 0.00] class: 0.0 | | | | |--- arrival_date > 24.50 | | | | | |--- weights: [3.73, 22.77] class: 1.0
importances = best_model.feature_importances_
indices = np.argsort(importances)
plt.figure(figsize=(12, 12))
plt.title("Feature Importances")
plt.barh(range(len(indices)), importances[indices], color="violet", align="center")
plt.yticks(range(len(indices)), [feature_names[i] for i in indices])
plt.xlabel("Relative Importance")
plt.show()
# training performance comparison
models_train_comp_df = pd.concat(
    [
        decision_tree_perf_train.T,
        decision_tree_tune_perf_train.T,
        decision_tree_post_perf_train.T,
    ],
    axis=1,
)
models_train_comp_df.columns = [
    "Decision Tree sklearn",
    "Decision Tree (Pre-Pruning)",
    "Decision Tree (Post-Pruning)",
]
print("Training performance comparison:")
models_train_comp_df
Training performance comparison:
| | Decision Tree sklearn | Decision Tree (Pre-Pruning) | Decision Tree (Post-Pruning) |
|---|---|---|---|
| Accuracy | 0.994211 | 0.879923 | 0.896542 |
| Recall | 0.986608 | 0.856989 | 0.902786 |
| Precision | 0.995776 | 0.794568 | 0.806279 |
| F1 | 0.991171 | 0.824599 | 0.851808 |
# testing performance comparison
models_test_comp_df = pd.concat(
    [
        decision_tree_perf_test.T,
        decision_tree_tune_perf_test.T,
        decision_tree_post_perf_test.T,
    ],
    axis=1,
)
models_test_comp_df.columns = [
    "Decision Tree sklearn",
    "Decision Tree (Pre-Pruning)",
    "Decision Tree (Post-Pruning)",
]
print("Testing performance comparison:")
models_test_comp_df
Testing performance comparison:
| | Decision Tree sklearn | Decision Tree (Pre-Pruning) | Decision Tree (Post-Pruning) |
|---|---|---|---|
| Accuracy | 0.872278 | 0.869981 | 0.866948 |
| Recall | 0.808348 | 0.839012 | 0.857751 |
| Precision | 0.799270 | 0.777018 | 0.761341 |
| F1 | 0.803783 | 0.806826 | 0.806676 |
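The gap between training and testing scores gives a rough measure of overfitting. The following is a minimal sketch that only re-enters the F1 values from the two comparison tables by hand; the DataFrame and column names (`f1_scores`, `train_f1`, `test_f1`, `gap`) are illustrative and are not part of the original notebook.

```python
import pandas as pd

# F1 scores copied by hand from the training and testing comparison tables.
f1_scores = pd.DataFrame(
    {
        "train_f1": [0.991171, 0.824599, 0.851808],
        "test_f1": [0.803783, 0.806826, 0.806676],
    },
    index=[
        "Decision Tree sklearn",
        "Decision Tree (Pre-Pruning)",
        "Decision Tree (Post-Pruning)",
    ],
)

# A large positive gap means the model scores much better on the data it was
# trained on than on unseen data, which is a rough sign of overfitting.
f1_scores["gap"] = f1_scores["train_f1"] - f1_scores["test_f1"]
print(f1_scores.round(3))
```

With these numbers, the default tree has the largest train-test gap (about 0.19), while the pre-pruned and post-pruned trees have much smaller gaps (about 0.02 and 0.05) at a nearly identical test F1.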
Feature selection was applied to the data to make the model more robust, but it did not improve the model much.
Several threshold-selection methods were tried for the logistic regression model to increase the F1 score, since both recall and precision matter for this problem.
A decision tree model was also fitted to find the best model for the data.
We visualized the different trees and their confusion matrices to better understand the models. Easy interpretation is one of the key benefits of decision trees.
The decision tree model predicts better than the logistic regression model: its F1 score on the test set rises to about 0.80.
Both pre-pruning and post-pruning were applied to the tree to improve the model and reduce overfitting.
Overall, the decision tree model achieves the better prediction score on the data.
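One common way to pick a classification threshold that favors F1 is to scan the precision-recall curve on the training data. The sketch below illustrates that idea under stated assumptions: it uses synthetic data from `make_classification` instead of the hotel bookings, and scikit-learn's `LogisticRegression` as a stand-in for whichever logistic regression implementation the analysis used. It is not a reproduction of the original threshold search.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, precision_recall_curve
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in data (an assumption, NOT the hotel bookings).
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.67, 0.33], random_state=1
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=1
)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
train_probs = model.predict_proba(X_train)[:, 1]

# precision and recall have one more entry than thresholds, so drop the last
# point before computing F1 for each candidate threshold.
precision, recall, thresholds = precision_recall_curve(y_train, train_probs)
f1 = 2 * precision[:-1] * recall[:-1] / (precision[:-1] + recall[:-1] + 1e-12)
best_threshold = thresholds[np.argmax(f1)]

# Compare the default 0.5 cut-off with the F1-maximizing one on the test set.
test_probs = model.predict_proba(X_test)[:, 1]
tuned_pred = (test_probs >= best_threshold).astype(int)
default_pred = model.predict(X_test)

print(f"best threshold on training data: {best_threshold:.3f}")
print(f"test F1 at 0.5: {f1_score(y_test, default_pred):.3f}")
print(f"test F1 at tuned threshold: {f1_score(y_test, tuned_pred):.3f}")
```

The threshold is chosen on the training split only, so the test scores remain an honest check; the tuned threshold is not guaranteed to beat 0.5 on unseen data.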
According to the decision tree model:
lead_time is the most important predictor. The analysis shows that for bookings made more than about 5 months ahead, cancellations outnumber the bookings that are kept. We recommend limiting advance bookings to 5 months for high-demand periods (such as holidays) and to 3 months for the rest of the year.
The next most important feature is the online market segment, which needs more investigation before improvements can be proposed. Cancellations may be higher in this segment because cancelling online is easier, but more information is needed to confirm this.
The next most important features are no_of_special_requests and avg_price_per_room. We recommend paying more attention to special requests, since they are an important factor in cancellations, and offering flexible room prices at different times of the year to stay competitive (for example, seasonal or recurring offers).
arrival_month is the next most important feature. Attractive packages in low-demand months with high cancellation rates could reduce the number of cancellations.
Next are no_of_week_nights and no_of_weekend_nights, whose effect may be due to high prices. Adjusting weekday or weekend prices, or offering promotions (such as promotion codes), might help here.
The importance of the other features can be discussed in the same way.
There are also some opportunities, such as parking reservations: parking may be an important feature, but few customers reserve it. We recommend developing such features, for example by offering car leasing through the hotel, which might help generate cash flow.
We also recommend applying feature selection to the decision tree models to see whether it improves them.
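The feature-selection suggestion for decision trees could be explored along the following lines. This is a minimal sketch under stated assumptions: it runs on synthetic data from `make_classification`, not the booking data, and the choices of `SelectFromModel`, `threshold="mean"`, and `max_depth=6` are illustrative rather than taken from the analysis.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectFromModel
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (an assumption, NOT the hotel bookings).
X, y = make_classification(
    n_samples=2000, n_features=20, n_informative=5, random_state=1
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=1
)

# Keep only features whose impurity-based importance is at least the mean
# importance of a tree fitted on the training data.
selector = SelectFromModel(
    DecisionTreeClassifier(max_depth=6, random_state=1), threshold="mean"
).fit(X_train, y_train)
X_train_sel = selector.transform(X_train)
X_test_sel = selector.transform(X_test)

# Fit the same tree configuration on all features and on the reduced set.
full_tree = DecisionTreeClassifier(max_depth=6, random_state=1).fit(X_train, y_train)
reduced_tree = DecisionTreeClassifier(max_depth=6, random_state=1).fit(
    X_train_sel, y_train
)

n_kept = X_train_sel.shape[1]
f1_full = f1_score(y_test, full_tree.predict(X_test))
f1_reduced = f1_score(y_test, reduced_tree.predict(X_test_sel))
print(f"features kept: {n_kept} of {X_train.shape[1]}")
print(f"test F1, all features: {f1_full:.3f}")
print(f"test F1, selected features: {f1_reduced:.3f}")
```

Comparing the two test F1 scores shows whether dropping low-importance features helps, hurts, or makes no difference; on the real data the outcome would need to be checked rather than assumed.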